Abstract

The predominant knowledge-based approach to automated model construction, compositional 
modelling, employs a set of models of particular functional components. Its inference mechanism 
takes a scenario describing the constituent interacting components of a system and translates it into 
a useful mathematical model. This paper presents a novel compositional modelling approach aimed 
at building model repositories. It furthers the field in two respects. Firstly, it expands the application
domain of compositional modelling to systems that cannot be easily described in terms of
interacting functional components, such as ecological systems. Secondly, it enables the incorpora- 
tion of user preferences into the model selection process. These features are achieved by casting the 
compositional modelling problem as an activity-based dynamic preference constraint satisfaction 
problem, where the dynamic constraints describe the restrictions imposed over the composition of 
partial models and the preferences correspond to those of the user of the automated modeller. In 
addition, the preference levels are represented through the use of symbolic values that differ in 
orders of magnitude. 

1. Introduction 

Mathematical models form an important aid in understanding complex systems. They also help 
problem solvers to capture and reason about the essential features and dynamics of such systems. 
Constructing mathematical models is not an easy task, however, and many disciplines have con- 
tributed approaches to automate it. Compositional modelling (Falkenhainer & Forbus, 1991; Kep- 
pens & Shen, 2001b) is an important class of approaches to automated model construction. It uses 
predominantly knowledge-based techniques to translate a high level scenario into a mathematical 
model. The knowledge base usually consists of generic fragments of models that provide one of
the possible mathematical representations of a process that occurs in one or more components. The
inference mechanisms instantiate this knowledge base, search for the most appropriate selection of 
model fragments, and compose them into a mathematical model. Compositional modelling has been 
successfully applied to a variety of application domains, ranging from simple physics and various
engineering problems to biological systems.

The present work aims at a compositional modelling approach for building model repositories 
of ecological systems. In the ecological modelling literature, a range of models have been devised 
to formally characterise the various phenomena that occur in ecological systems. For example, 
the logistic growth (Verhulst, 1838) and the Holling predation (Holling, 1959) models describe the
changes in the size of a population. The former expresses changes due to births and deaths and the 
latter changes due to one population feeding on another. A compositional model repository aims
to make such (partial) models more generally usable by providing a mechanism to instantiate and
compose them into larger models for more complex systems involving many interacting phenomena. 

Thus, the input to a compositional model repository is a scenario describing the configuration 
of a system to be modelled. A sample scenario may include a number of populations and various 
predation and competition relations between them. The output is a mathematical model, called a 
scenario model, representing the behaviour of the system specified in the given scenario. A set of 
differential equations describing the changes in the population sizes in the aforementioned scenario 
due to births, natural deaths, deaths because of predation, available food supply or competition 
would constitute such a scenario model. 

This application domain poses three important new challenges to compositional modelling. 
Firstly, the processes and components of an ecological system that are to be represented in the 
resulting composed model depend on one another and on the ways they are described. In popu- 
lation dynamics for example, models describing the predation or competition phenomena between 
two populations rely on the existence of a population growth model for each of the populations 
involved in the phenomenon. This inhibits the conventional approach of searching for a consis- 
tent and adequate combination of partial models, one for each component in the scenario. This 
approach provides an adequate solution for physical systems because these are comprised of com- 
ponents implementing a particular functionality that can be described by one or multiple partial 
models. Although the seminal work on compositional modelling (Falkenhainer & Forbus, 1991) 
recognised the existence of more complex interdependencies in model construction in general, it 
provided only a partial solution for it: all the conditions under which certain modelling choices 
were relevant had to be specified manually in the knowledge base. 

Secondly, the domain of ecology lacks a complete theory of what constitutes an adequate model. 
Most existing compositional modellers are based on a predefined concept of model adequacy. They 
employ inference mechanisms that are guaranteed to find a model that meets such adequacy criteria. 
However, criteria to determine how adequate an ecological model may be vary between ecological 
domains and even between the ecologists that require the model within the same domain. Therefore, 
the compositional modeller requires a facility to define the properties that the generated ecological 
models must satisfy. 

Thirdly, it is not possible to express all the criteria imposed on the scenario model in terms of 
hard requirements. Often, ecological models that describe mechanisms and behaviours are only par- 
tially understood. In such cases, the choice of one model over another becomes a matter of expert 
opinion rather than pure theory. Therefore, in the ecological domain, modelling approaches and pre- 
sumptions are, to some extent, selected based on preferences. Existing compositional modellers are 
not equipped to deal with such user preferences and this paper presents the very first compositional 
modeller that incorporates them. 

Generally speaking, the above three issues are tackled in this paper by means of a method to 
translate the compositional modelling problem into an activity-based dynamic preference constraint 
satisfaction problem (aDPCSP) (Keppens & Shen, 2002). An aDPCSP integrates the concept of 
activity-based dynamic constraint satisfaction problem (aDCSP) (Miguel & Shen, 1999; Mittal & 
Falkenhainer, 1990) with that of order-of-magnitude preferences (Keppens & Shen, 2002). The 
attributes and domains of this aDPCSP correspond to model design decisions, with constraints de- 
scribing the restrictions imposed by consistency requirements and properties and order-of-magnitude 
preferences describing the user's preferences on modelling choices. The translation method brings 
the additional advantage that compositional modelling problems can now be solved by means of
efficient aDCSP techniques. As such, compositional modellers can benefit from recent and future
advances in constraint satisfaction research.

The remainder of this paper is organised as follows. Section 2 introduces the concept of an
aDPCSP and a preference calculus that is suitable for expressing subjective user preferences for model
design decisions and for integration into the general framework of aDPCSPs. It also gives a
solution algorithm for aDPCSPs. Next, section 3 presents the compositional model repository and 
shows how such an aDPCSP is employed for automated model construction. These theoretical 
ideas are then illustrated by means of a large example in section 4, applying the compositional 
model repository to population dynamics problems. Section 5 concludes this paper with a summary 
and an outline of further research. 

2. Dynamic Constraint Satisfaction with Order-of-Magnitude Preferences 

In this section, a preference calculus based on order-of-magnitude reasoning is introduced and inte- 
grated into the activity-based dynamic constraint satisfaction problem (aDCSP) to form an aDCSP 
with order-of-magnitude preferences (aDPCSP). Then, a solution algorithm for such aDPCSPs is 
presented. The theory is illustrated with examples from the compositional modelling domain. 

2.1 Background: Activity-based dynamic preference constraint satisfaction 

A hard constraint satisfaction problem (CSP) is a tuple $(X, D, C)$, where

• $X = \{x_1, \ldots, x_n\}$ is a vector of $n$ attributes,

• $D = \{D_{x_1}, \ldots, D_{x_n}\}$ is a vector containing exactly one domain for each attribute in $X$.
Each domain $D_{x_i} \in D$ is a set of values $\{d_{i1}, \ldots, d_{in_i}\}$ that may be assigned to the attribute
corresponding to the domain.

• $C$ is a set of compatibility constraints. A compatibility constraint $c_{\{x_i,\ldots,x_j\}} \in C$ defines a
relation over a subset of the domains $D_{x_i}, \ldots, D_{x_j}$, and hence $c_{\{x_i,\ldots,x_j\}} \subseteq D_{x_i} \times \ldots \times D_{x_j}$.

A solution to a hard constraint satisfaction problem is any tuple $(x_1 : d_{x_1}, \ldots, x_n : d_{x_n})$ such
that

• each attribute is assigned a value from its domain: $\forall x_i \in X,\ d_{x_i} \in D_{x_i}$, and

• all compatibility constraints are satisfied: $\forall c_{\{x_i,\ldots,x_j\}} \in C,\ (d_{x_i}, \ldots, d_{x_j}) \in c_{\{x_i,\ldots,x_j\}}$.

An activity-based dynamic CSP (aDCSP), originally proposed by Mittal and Falkenhainer
(1990), extends conventional CSPs with the notion of activity of attributes. In an aDCSP, not all 
attributes are necessarily assigned in a solution, but only the active ones. As such, each attribute is 
either active and assigned a value or inactive: 

$$\forall x_i \in X,\ (\exists d_{x_i} \in D_{x_i},\ x_i : d_{x_i}) \leftrightarrow \mathrm{active}(x_i)$$

The activity of attributes in an aDCSP is governed by activity constraints that determine under which
assignments of attributes an assignment to another attribute is relevant or possible. This information
is important because it dictates not only the attributes for which a value must be searched, but also the
set of compatibility constraints that must be satisfied. Clearly, only the compatibility constraints
$c_{\{x_i,\ldots,x_j\}} \in C$ for which all attributes $x_i, \ldots, x_j$ are active must be satisfied, and a hard CSP is a
sub-type of aDCSP in which all attributes are always active.

In summary, an activity-based dynamic constraint satisfaction problem (aDCSP) is a tuple
$(X, D, C, A)$, where

• $(X, D, C)$ is a hard CSP and

• $A$ is a set of activity constraints. An activity constraint restricts the sets of attribute-value
assignments under which an attribute is active or inactive:

$$a_{x_i,\{x_j,\ldots,x_k\}} \subseteq D_{x_j} \times \ldots \times D_{x_k} \times \{\mathrm{active}(x_i), \neg\mathrm{active}(x_i)\}$$

where $x_i \notin \{x_j, \ldots, x_k\}$.

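A solution to an activity-based dynamic constraint satisfaction problem is any tuple
$(x_1 : d_{x_1}, \ldots, x_l : d_{x_l})$ such that

• the attributes that are part of the solution are assigned a value from their domain:
$\forall x_i \in \{x_1, \ldots, x_l\},\ d_{x_i} \in D_{x_i}$,

• all activity constraints are satisfied:

$$\forall a_{x_i,\{x_j,\ldots,x_k\}} \in A,\ (x_j \notin \{x_1,\ldots,x_l\}) \vee \ldots \vee (x_k \notin \{x_1,\ldots,x_l\}) \vee$$
$$(x_i \in \{x_1,\ldots,x_l\} \wedge (d_{x_j}, \ldots, d_{x_k}, \mathrm{active}(x_i)) \in a_{x_i,\{x_j,\ldots,x_k\}}) \vee$$
$$(x_i \notin \{x_1,\ldots,x_l\} \wedge (d_{x_j}, \ldots, d_{x_k}, \neg\mathrm{active}(x_i)) \in a_{x_i,\{x_j,\ldots,x_k\}})$$

and

• all compatibility constraints are satisfied:

$$\forall c_{\{x_i,\ldots,x_j\}} \in C,\ \neg\mathrm{active}(x_i) \vee \ldots \vee \neg\mathrm{active}(x_j) \vee (d_{x_i}, \ldots, d_{x_j}) \in c_{\{x_i,\ldots,x_j\}}$$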
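
The solution conditions for an aDCSP can be checked mechanically. The following Python sketch is one possible reading of them and not the authors' implementation: the function name `is_solution`, the encoding of each activity constraint as a `(target, scope, relation)` triple and the three-attribute toy problem are all assumptions made for illustration. An attribute that is always active is encoded here by an activity constraint with an empty scope.

```python
def is_solution(solution, domains, compat, activity):
    """Check a candidate aDCSP solution (dict: active attribute -> value)."""
    # Every assigned attribute must take a value from its own domain.
    if any(value not in domains[x] for x, value in solution.items()):
        return False
    # Activity constraints: relation rows are (value_1, ..., value_k, is_active).
    for target, scope, relation in activity:
        if any(x not in solution for x in scope):
            continue  # some scope attribute is inactive: nothing to enforce
        row = tuple(solution[x] for x in scope) + (target in solution,)
        if row not in relation:
            return False
    # Compatibility constraints only bind when all their attributes are active.
    for scope, relation in compat:
        if any(x not in solution for x in scope):
            continue
        if tuple(solution[x] for x in scope) not in relation:
            return False
    return True


# Hypothetical toy problem: x3 = predation relevant?, x4 = growth model,
# x6 = predation model (only active when x3 = "yes").
DOMAINS = {
    "x3": {"yes", "no"},
    "x4": {"other", "logistic"},
    "x6": {"Holling", "Lotka-Volterra"},
}
ACTIVITY = [
    ("x3", (), {(True,)}),                            # x3 is always active
    ("x4", (), {(True,)}),                            # x4 is always active
    ("x6", ("x3",), {("yes", True), ("no", False)}),  # x6 active iff x3 = yes
]
COMPAT = [
    (("x4", "x6"), {("logistic", "Holling"), ("other", "Lotka-Volterra")}),
]

print(is_solution({"x3": "yes", "x4": "logistic", "x6": "Holling"},
                  DOMAINS, COMPAT, ACTIVITY))  # True
print(is_solution({"x3": "no", "x4": "other"},
                  DOMAINS, COMPAT, ACTIVITY))  # True (x6 inactive)
print(is_solution({"x3": "yes", "x4": "other", "x6": "Holling"},
                  DOMAINS, COMPAT, ACTIVITY))  # False (incompatible)
```

Note that the second candidate leaves x6 unassigned, so the compatibility constraint over x4 and x6 is not enforced, which mirrors the disjunction with the negated activity terms in the definition.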

2.2 Order-of-magnitude preferences (OMPs) 

Although an aDCSP can capture the hard constraints over decisions in a given problem as well as 
their dynamically changing solution space (as described by the activity constraints), the represen- 
tation scheme it employs does not take into account any preferences users may have over possible 
alternative value assignments. Therefore, this work is extended to allow preference information to 
be attached to attribute-value assignments. The way in which this can be achieved depends on the 
representation and reasoning mechanisms underlying the preference calculus. In general, a preference
calculus can be defined as a tuple $(\mathbb{P}, \oplus, \preceq)$ where:

• $\mathbb{P}$ is the set of preferences,

• $\oplus$ is a commutative, associative operator that is closed in $\mathbb{P}$, and

• $\preceq$ forms a partial order, that is, a reflexive, anti-symmetric and transitive relation defined over
$\mathbb{P} \times \mathbb{P}$.

Because $\preceq$ is reflexive, anti-symmetric and transitive, comparing preferences with the $\preceq$ relation
yields one of four possible results:

• Two preferences $P_1, P_2 \in \mathbb{P}$ are equal to one another (denoted $P_1 = P_2$) iff $P_1 \preceq P_2$ and
$P_2 \preceq P_1$.

• A preference $P_1 \in \mathbb{P}$ is strictly greater than a preference $P_2 \in \mathbb{P}$ (denoted $P_1 \succ P_2$) iff
$P_1 \not\preceq P_2$ and $P_2 \preceq P_1$.

• A preference $P_1 \in \mathbb{P}$ is strictly smaller than a preference $P_2 \in \mathbb{P}$ (denoted $P_1 \prec P_2$) iff
$P_1 \preceq P_2$ and $P_2 \not\preceq P_1$.

• Two preferences $P_1, P_2 \in \mathbb{P}$ are incomparable with one another (denoted $P_1\,?\,P_2$) iff $P_1 \not\preceq P_2$
and $P_2 \not\preceq P_1$.
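
The four outcomes follow directly from two evaluations of the ordering relation. The sketch below is illustrative only: the helper `compare` and the toy calculus (sets of labels ordered by inclusion and combined by union) are assumptions, chosen because set inclusion is a familiar partial order with a commutative, associative combination operator. They are not the preference calculus developed in this paper.

```python
def compare(leq, p1, p2):
    """Classify p1 against p2 under a partial order given as a function leq."""
    forward, backward = leq(p1, p2), leq(p2, p1)
    if forward and backward:
        return "="   # equal
    if forward:
        return "<"   # p1 strictly smaller than p2
    if backward:
        return ">"   # p1 strictly greater than p2
    return "?"       # incomparable


def subset_leq(a, b):
    # Toy partial order: a is below b when a is a subset of b.
    return a <= b


small = frozenset({"a"})
large = frozenset({"a", "b"})
other = frozenset({"c"})

print(compare(subset_leq, small, large))  # <
print(compare(subset_leq, large, small))  # >
print(compare(subset_leq, small, small))  # =
print(compare(subset_leq, small, other))  # ?
```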

Thus, an activity-based dynamic preference constraint satisfaction problem (aDPCSP) is a tuple
$(X, D, C, A, (\mathbb{P}, \oplus, \preceq), P)$ where

• $(X, D, C, A)$ is an aDCSP,

• $(\mathbb{P}, \oplus, \preceq)$ is a preference calculus, and

• $P$ is a mapping $D_{x_1} \cup \ldots \cup D_{x_n} \to \mathbb{P}$ from the individual attribute-value assignments to the
preferences.

The preferences attached to attribute-value assignments express the relative desirability of these
assignments. The aim of the aDPCSP is to find a solution with the highest combined preference.
That is, given an aDPCSP $(X, D, C, A, (\mathbb{P}, \oplus, \preceq), P)$, any solution $(x_i : d_{x_i}, \ldots, x_j : d_{x_j})$ of the
aDCSP $(X, D, C, A)$ such that no other solution $(x_k : d_{x_k}, \ldots, x_l : d_{x_l})$ of $(X, D, C, A)$ exists
with $P(x_i : d_{x_i}) \oplus \ldots \oplus P(x_j : d_{x_j}) \prec P(x_k : d_{x_k}) \oplus \ldots \oplus P(x_l : d_{x_l})$ is a solution to the aDPCSP.

In this section, a preference calculus is introduced to extend an aDCSP into an aDPCSP. The 
calculus will be illustrated with examples from the compositional modelling domain. 

2.2.1 Representation of OMPs

Technically, OMPs are combinations of so-called basic preference quantities (BPQs), which are the 
primitive units of preference or utility valuation associated with possible design decisions. Because 
it is often difficult to evaluate these BPQs numerically, they are ordered relative to one another em- 
ploying similar ordering relations as those employed by relative order-of-magnitude calculi (Dague, 
1993a, 1993b). 

Let $B$ be the set of all BPQs with respect to a particular decision problem. The BPQs in $B$ are
ordered with respect to one another at two levels of granularity, by two relations $\ll$ and $<$. First, $B$
is partitioned into orders of magnitude, which are ordered by $\ll$. Then, the BPQs within each order
of magnitude are ordered by $<$. Formally, an order-of-magnitude ordering over BPQs $B$ is a tuple
$(\mathbb{O}, \ll)$, where $\mathbb{O} = \{O_1, \ldots, O_q\}$ is a partition of $B$ and $\ll$ is an irreflexive and transitive binary
relation over $\mathbb{O}$. Any subset of BPQs $O \in \mathbb{O}$ is said to be an order of magnitude in $B$. Similarly, a
within-magnitude ordering over a set of BPQs is a tuple $(O, <)$, where $O$ is an order of magnitude
in $B$ and $<$ is an irreflexive and transitive binary relation over $O$.

To illustrate these ideas, consider the problem of constructing an ecological model describing a 
scenario containing a number of populations. Let some of the populations be parasites and others 
be hosts for these parasites. Also, assume that certain populations compete with others for scarce 
resources. In order to construct a scenario model, the compositional modeller must make a number
of model design decisions: which population growth, host-parasitoid and competition phenomena
are relevant, and which types of model best describe these phenomena.

Keppens & Shen

Figure 1: Sample space of BPQs B
[Figure not reproduced. Recoverable panel labels: O1 (host-parasitoid phenomenon), containing b11 (Rogers' host-parasitoid model), b12 (Nicholson-Bailey's host-parasitoid model), b13 (Holling predation model), b14 (Thompson's host-parasitoid model) and b15 (Lotka-Volterra predation model), linked by < relations; O2 (population growth phenomena); O3 (competition phenomenon), containing b31 (competition phenomenon).]

Figure 1 shows a sample space of BPQs that correspond to the selection of types of model. For
the sake of illustration, the presumption is made that the quality of a scenario model depends on the
inclusion of types of model, rather than on the inclusion or exclusion of phenomena. Apart from
$b_{23}$ and $b_{31}$, all BPQs correspond to standard textbook ecological models.¹ BPQ $b_{23}$ stands for the
use of a population growth model that is implicit in another population growth model (the Lotka-
Volterra model, for instance, implicitly includes its own concept of growth). Finally, BPQ $b_{31}$ is the
preference associated with a competition model (say, the only one included in the knowledge base).

The 9 BPQs in this sample space are partitioned over 3 orders of magnitude. The $\ll$ relation
orders the orders of magnitude: $O_2 \ll O_1$ and $O_2 \ll O_3$. The binary $<$ relation orders individual
BPQs within an order of magnitude. In the BPQ ordering within $O_1$, for instance, Rogers'
host-parasitoid model ($b_{11}$) is preferred over that by Nicholson and Bailey ($b_{12}$) and the Holling
predation model ($b_{13}$). The latter two models cannot be compared with one another, but they both
are preferred over the Lotka-Volterra model. Furthermore, Thompson's host-parasitoid model is
less preferred than that of Nicholson and Bailey, but it cannot be compared with the Lotka-Volterra
and Holling models.
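
The within-magnitude ordering of $O_1$ described above can be queried with a few lines of code. This is a minimal sketch under stated assumptions: the pair encoding `LESS_O1` (lower BPQ first) and the helpers `strictly_less` and `comparable` are hypothetical names, and the pairs are read off the prose description of Figure 1 rather than taken from any published data file.

```python
# Direct "lower < higher" statements for O1, as described in the text.
LESS_O1 = {
    ("b12", "b11"),  # Nicholson-Bailey < Rogers
    ("b13", "b11"),  # Holling < Rogers
    ("b15", "b12"),  # Lotka-Volterra < Nicholson-Bailey
    ("b15", "b13"),  # Lotka-Volterra < Holling
    ("b14", "b12"),  # Thompson < Nicholson-Bailey
}


def strictly_less(a, b, less):
    """True if a < b follows from `less` by transitivity."""
    frontier, seen = [a], set()
    while frontier:
        current = frontier.pop()
        for low, high in less:
            if low == current and high not in seen:
                if high == b:
                    return True
                seen.add(high)
                frontier.append(high)
    return False


def comparable(a, b, less):
    return a == b or strictly_less(a, b, less) or strictly_less(b, a, less)


print(strictly_less("b15", "b11", LESS_O1))  # True, via b12 or b13
print(comparable("b12", "b13", LESS_O1))     # False
print(comparable("b14", "b15", LESS_O1))     # False
print(comparable("b14", "b13", LESS_O1))     # False
```

Because $<$ is only required to be irreflexive and transitive, the sketch never assumes that two BPQs are comparable, which is what allows the incomparable pairs to be reported.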

2.2.2 Combinations of OMPs 

By definition, OMPs are combinations of BPQs. The implicit value of an OMP $P$ equals the
combination $b_1 \oplus \ldots \oplus b_n$ of its constituent BPQs $b_1, \ldots, b_n$. This property allows OMPs to be defined
as functions such that an OMP $P = b_1 \oplus \ldots \oplus b_n$ is a function $f_P : B \to \mathbb{N} : b \mapsto f_P(b)$ where $B$
is the set of BPQs, $\mathbb{N}$ is the set of natural numbers and $f_P(b)$ equals the number of occurrences of $b$
in $b_1, \ldots, b_n$.

1. To be precise, the BPQs $b_{11}$, $b_{12}$, $b_{13}$, $b_{14}$, $b_{15}$, $b_{21}$ and $b_{22}$ respectively correspond to the inclusion of Rogers'
host-parasitoid model (1972), the host-parasitoid model by Nicholson and Bailey (1935), Holling's predation model
(1959), Thompson's host-parasitoid model (1929), the predation model by Lotka and Volterra (1925, 1926), a logistic
population growth model (Verhulst, 1838) and an exponential population growth model (Malthus, 1798).

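For example, let $P_{model}$ denote the OMP associated with the scenario model that contains three
logistic population growth models ($b_{21}$), two Holling predation models ($b_{13}$) and one competition
model ($b_{31}$). Therefore,

$$P_{model} = b_{21} \oplus b_{21} \oplus b_{21} \oplus b_{13} \oplus b_{13} \oplus b_{31}$$

and hence:

$$f_{P_{model}}(b) = \begin{cases} 3 & \text{if } b = b_{21} \\ 2 & \text{if } b = b_{13} \\ 1 & \text{if } b = b_{31} \\ 0 & \text{otherwise} \end{cases}$$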

By describing OMPs as functions, the concept of combinations of OMPs becomes clear. For
two OMPs $P_1$ and $P_2$, the combined preference $P_1 \oplus P_2$ is defined as:

$$f_{P_1 \oplus P_2} : B \to \mathbb{N} : b \mapsto f_{P_1 \oplus P_2}(b) = f_{P_1}(b) + f_{P_2}(b)$$

Note that the combination operator $\oplus$ is assumed to be commutative, associative and strictly
monotonic ($P \prec P \oplus P$). The latter assumption is made to better reflect the ideas underpinning
conventional utility calculi (Binger & Hoffman, 1998).
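
Since an OMP is a function that counts occurrences of BPQs, a multiset is a natural encoding. The sketch below uses Python's `collections.Counter` for this purpose; the helper name `combine` and the string labels for BPQs are assumptions made for illustration, not notation from the paper.

```python
from collections import Counter


def combine(*omps):
    """Combine OMPs by adding their BPQ occurrence counts (the ⊕ operator)."""
    total = Counter()
    for omp in omps:
        total += omp
    return total


# Single-BPQ preferences used in the P_model example.
b21, b13, b31 = (Counter({name: 1}) for name in ("b21", "b13", "b31"))

p_model = combine(b21, b21, b21, b13, b13, b31)

# f_P(b) is simply a lookup; BPQs that do not occur count as 0.
print(p_model["b21"], p_model["b13"], p_model["b31"], p_model["b11"])  # 3 2 1 0

# Addition of counts makes the operator commutative and associative.
print(combine(b21, b13) == combine(b13, b21))  # True
```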

2.2.3 Partial ordering of OMPs

Based on the combinations of OMPs, a partial order $\preceq$ over the OMPs can be computed by exploiting
the constituent BPQs of the OMPs considered. This partial order implies that a comparison of
any pair of OMPs either returns equal preference ($=$), smaller preference ($\prec$), greater preference
($\succ$) or incomparable preference ($?$). This calculus is developed assuming the following:

• Prioritisation: A combination of BPQs is never an order of magnitude greater than its
constituent BPQs. That is, given the set of BPQs belonging to the same order of magnitude
$\{b_1, b_2, \ldots, b_n\} \subseteq O_1$ and a BPQ $b \in O_2$ belonging to a higher order of magnitude, i.e.
$O_1 \ll O_2$, then

$$b_1 \oplus b_2 \oplus \cdots \oplus b_n \prec b$$

With respect to the ongoing example, this means that any BPQ taken from the order of magnitude
$O_1$ is preferred over any combination of BPQs taken from $O_2$. In other words, the choice
of a model to describe a host-parasitoid phenomenon is considered more important than the
choice of population growth model (see Figure 1).

Prioritisation also means that distinctions at higher orders of magnitude are considered to
be more significant than those at lower orders of magnitude. Consider a number of BPQs
$b_1, \ldots, b_{m-1}, b_m, \ldots, b_n$ taken from one order of magnitude $O_1$ and a pair of BPQs $\{b, b'\}$
taken from an order of magnitude that is higher than $O_1$. If $b < b'$, then (irrespective of the
ordering of the BPQs taken from $O_1$)

$$b \oplus b_1 \oplus \cdots \oplus b_{m-1} \prec b_m \oplus \cdots \oplus b_n \oplus b'$$

• Strict monotonicity: Even though distinctions at higher orders of magnitude are more significant,
distinctions at lower orders of magnitude are not negligible. That is, given an OMP
$P$ and two BPQs $b_1$ and $b_2$ taken from the same order of magnitude with $b_1 < b_2$, then
(irrespective of the orders of magnitude of the BPQs that constitute $P$)

$$b_1 \oplus P \prec b_2 \oplus P$$

For instance, the preference ordering depicted in Figure 1 shows that a scenario model with a
Rogers' host-parasitoid model and two logistic population growth models is preferred over one with a
Rogers' host-parasitoid model and two exponential population growth models:

$$b_{11} \oplus b_{22} \oplus b_{22} \prec b_{11} \oplus b_{21} \oplus b_{21}$$

Note that this is a departure from conventional order-of-magnitude reasoning. If the OMPs
associated with two (partial) outcomes contain equal BPQs at a higher order of magnitude, it is
usually desirable to compare both solutions further in terms of the (less important) constituent
BPQs at lower orders of magnitude, as the example illustrates. However, conventional order-
of-magnitude reasoning techniques cannot handle this.

• Partial ordering maintenance: Conventional order-of-magnitude reasoning is motivated by
the need for abstract descriptions of real-world behaviour, whereas the OMP calculus is
motivated by incomplete knowledge for decision making. As opposed to conventional order-
of-magnitude reasoning and real numbers, OMPs are not necessarily totally ordered. This
implies that, when the user states, for example, that $b_1 < b_2 < b$ and that $b_3 < b_4 < b$, the
explicit absence of ordering information between the BPQs in $\{b_1, b_2\}$ and those in $\{b_3, b_4\}$
means that the user is unable to compare them (e.g. because they are entirely different things).
Consequently, $b_1 \oplus b_2$ would be deemed incomparable to $b_3 \oplus b_4$ (i.e. $b_1 \oplus b_2\,?\,b_3 \oplus b_4$), rather
than roughly equivalent.

From the above, it can be derived that given two OMPs $P_1$ and $P_2$ and an order of magnitude $O$,
$P_1$ is less or equally preferred to $P_2$ with respect to the order of magnitude $O$ (denoted $P_1 \preceq_O P_2$)
provided that

$$\forall b_i \in O,\ \sum_{b_j \in O,\, b_i \le b_j} f_{P_1}(b_j) \le \sum_{b_j \in O,\, b_i \le b_j} f_{P_2}(b_j)$$

where $b_i \le b_j$ holds if $b_i = b_j$ or $b_i < b_j$. Thus, comparing two OMPs within an order of magnitude
can yield four possible results:

• $P_1$ is less preferred than $P_2$ with respect to $O$ ($P_1 \prec_O P_2$) iff $(P_1 \preceq_O P_2) \wedge \neg(P_2 \preceq_O P_1)$,

• $P_1$ is more preferred than $P_2$ with respect to $O$ ($P_1 \succ_O P_2$) iff $\neg(P_1 \preceq_O P_2) \wedge (P_2 \preceq_O P_1)$,

• $P_1$ is equally preferred to $P_2$ with respect to $O$ ($P_1 =_O P_2$) iff $(P_1 \preceq_O P_2) \wedge (P_2 \preceq_O P_1)$,
and

• $P_1$ is incomparable to $P_2$ with respect to $O$ ($P_1\,?_O\,P_2$) iff $\neg(P_1 \preceq_O P_2) \wedge \neg(P_2 \preceq_O P_1)$.

In the ongoing example of Figure 1, for instance, the preference of a scenario model with a
Rogers' host-parasitoid model and a Holling predation model is $P_1 = b_{11} \oplus b_{13}$ and the preference
of a scenario model with a Rogers' host-parasitoid model and a Lotka-Volterra predation model
is $P_2 = b_{11} \oplus b_{15}$. The latter model is less than or equally preferred to the former within the
"host-parasitoid" order of magnitude ($O_1$), i.e. $P_2 \preceq_{O_1} P_1$, because

$$f_{P_2}(b_{11}) = 1 \le 1 = f_{P_1}(b_{11}),$$
$$f_{P_2}(b_{11}) + f_{P_2}(b_{12}) = 1 \le 1 = f_{P_1}(b_{11}) + f_{P_1}(b_{12}),$$
$$f_{P_2}(b_{11}) + f_{P_2}(b_{13}) = 1 \le 2 = f_{P_1}(b_{11}) + f_{P_1}(b_{13}),$$
$$f_{P_2}(b_{11}) + f_{P_2}(b_{12}) + f_{P_2}(b_{14}) = 1 \le 1 = f_{P_1}(b_{11}) + f_{P_1}(b_{12}) + f_{P_1}(b_{14}),$$
$$f_{P_2}(b_{11}) + f_{P_2}(b_{12}) + f_{P_2}(b_{13}) + f_{P_2}(b_{15}) = 2 \le 2 = f_{P_1}(b_{11}) + f_{P_1}(b_{12}) + f_{P_1}(b_{13}) + f_{P_1}(b_{15}).$$

Similarly, it can be established that the reverse, i.e. $P_1 \preceq_{O_1} P_2$, is not true. Therefore, the latter
scenario model is less preferred than the former within $O_1$, i.e. $P_2 \prec_{O_1} P_1$.

The above result can be further generalised such that given two OMPs $P_1$ and $P_2$, $P_1$ is less or
equally preferred to $P_2$ (denoted $P_1 \preceq P_2$) if

$$\forall O_i \in \mathbb{O},\ (P_1 \preceq_{O_i} P_2) \vee (\exists O_j \in \mathbb{O},\ O_i \ll O_j \wedge P_1 \prec_{O_j} P_2)$$

More generally, the relations $\prec$, $\succ$, $=$ and $?$ can be derived from $\preceq$ in the same manner as
$\prec_O$, $\succ_O$, $=_O$ and $?_O$ are derived from $\preceq_O$.

To illustrate the utility of such orderings, consider the scenario of one predator population that
feeds on two prey populations while the two prey populations compete for scarce resources. The
following are two plausible scenario models for this scenario:

• Model 1 contains two Holling predation models and three logistic population growth models,
and has preference $P_1 = b_{13} \oplus b_{13} \oplus b_{21} \oplus b_{21} \oplus b_{21}$.

• Model 2 contains one competition model, two Holling predation models, two logistic population
growth models and an exponential population growth model, and has preference
$P_2 = b_{13} \oplus b_{13} \oplus b_{21} \oplus b_{21} \oplus b_{22} \oplus b_{31}$.

As demonstrated earlier, it can be shown that $P_1 =_{O_1} P_2$, $P_1 \succ_{O_2} P_2$, and $P_1 \prec_{O_3} P_2$. From these
relations it follows that $P_1 \preceq P_2$ because

• for $O_1$: $P_1 \preceq_{O_1} P_2$ since $P_1 =_{O_1} P_2$,

• for $O_2$: there exists an order of magnitude $O_3$ where $O_3 \gg O_2$ and $P_1 \prec_{O_3} P_2$,

• for $O_3$: $P_1 \preceq_{O_3} P_2$ since $P_1 \prec_{O_3} P_2$.

As the reverse is not true, it can be concluded that scenario model 2 is preferred over scenario model
1.
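
The comparison of the two scenario models can be reproduced with a short program. The following is a sketch of the ordering as read from the definitions and worked examples in this section, not the authors' code: the names `ORDERS`, `HIGHER`, `up_set`, `leq_in`, `less_in` and `leq` are hypothetical, OMPs are encoded as `Counter` multisets, and $b_{23}$ is left out because its position within $O_2$ is not stated in the text.

```python
from collections import Counter

# Orders of magnitude: name -> (member BPQs, direct "lower < higher" pairs).
ORDERS = {
    "O1": ({"b11", "b12", "b13", "b14", "b15"},
           {("b12", "b11"), ("b13", "b11"), ("b15", "b12"),
            ("b15", "b13"), ("b14", "b12")}),
    "O2": ({"b21", "b22"}, {("b22", "b21")}),
    "O3": ({"b31"}, set()),
}
# HIGHER[o] holds the orders of magnitude strictly above o (O2 << O1, O2 << O3).
HIGHER = {"O1": set(), "O2": {"O1", "O3"}, "O3": set()}


def up_set(b, less):
    """All BPQs b' with b <= b' under the transitive closure of `less`."""
    seen, frontier = {b}, [b]
    while frontier:
        current = frontier.pop()
        for low, high in less:
            if low == current and high not in seen:
                seen.add(high)
                frontier.append(high)
    return seen


def leq_in(p1, p2, order):
    """P1 is less or equally preferred to P2 within one order of magnitude."""
    members, less = ORDERS[order]
    return all(
        sum(p1[c] for c in up_set(b, less)) <= sum(p2[c] for c in up_set(b, less))
        for b in members
    )


def less_in(p1, p2, order):
    return leq_in(p1, p2, order) and not leq_in(p2, p1, order)


def leq(p1, p2):
    """P1 is less or equally preferred to P2 across all orders of magnitude."""
    return all(
        leq_in(p1, p2, o) or any(less_in(p1, p2, h) for h in HIGHER[o])
        for o in ORDERS
    )


model1 = Counter(b13=2, b21=3)
model2 = Counter(b13=2, b21=2, b22=1, b31=1)
print(leq(model1, model2), leq(model2, model1))  # True False
```

The output states that model 1 is less or equally preferred to model 2 while the reverse fails, so model 2 is strictly preferred, in agreement with the conclusion drawn above.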

2.3 Solving aDPCSPs 

This section presents a basic algorithm for solving aDPCSPs. Although OMPs are used in this
work, this algorithm can take any aDPCSP provided that it employs a preference calculus with a
commutative, associative and monotonic combination operator. However, the use of OMPs provides
a convenient way of specifying incomplete preference information.

An aDPCSP is similar to valued CSPs as presented by Schiex, Fargier and Verfaillie (1995) 
and also to semiring based CSPs (Bistarelli, Montanari, & Rossi, 1997). However, it extends both 
approaches with activity constraints and involves different underlying presumptions in its valuation 
structure. The preference valuations in this work are allowed to be ordered partially, as opposed to 
the valued CSPs. 

An aDPCSP represents an important type of constraint satisfaction optimisation problem (Tsang, 
1993). In order to tackle the optimisation of preferences an A* type algorithm is employed (Hart, 
Nilsson, & Raphael, 1968; Raphael, 1990). A* algorithms are known to be efficient in terms of 
the total number of nodes explored in an effort to find optimal solutions, with a given amount of 
information. On the downside, they have an exponential space complexity. Naturally, a number of 
alternative approaches could have been explored, including conventional constraint-based solving 
methods such as depth first branch and bound search. However, the use of an A*-like algorithm is 
sufficient for solving the aDPCSPs in the domain of the present interest. In particular, algorithm 1 
implements an A* search strategy that is capable of handling activity constraints, which involves 
the use of basic CSP techniques such as constraint propagation and backtracking. 

An A* algorithm maintains the explored attribute-value assignments by means of a priority
queue $Q$ of nodes. Each node $n$ in $Q$ corresponds to a set of attribute-value assignments: $\mathrm{solution}(n)$.
The search proceeds through a number of iterations. At each iteration, a node $n$ is taken from $Q$,
and replaced by nodes that extend $\mathrm{solution}(n)$ with an additional attribute-value assignment. More
specifically, for each node $n$ in $Q$, a set $X_u(n)$ of remaining active but unassigned attributes is
maintained. At each iteration, the possible assignments of the first attribute $x \in X_u(n)$, where
$n$ is the node taken from $Q$ at the current iteration, are processed. For every assignment $x : d$
that is consistent with $\mathrm{solution}(n)$ (i.e. $\mathrm{solution}(n) \cup \{x : d\}, C \nvdash \bot$), a new child node $n'$, with
$\mathrm{solution}(n') = \mathrm{solution}(n) \cup \{x : d\}$ and $X_u(n') = X_u(n) \setminus \{x\}$, is created and added to $Q$.

The activity constraints are processed via propagation rather than constraint satisfaction. Whenever
a node $n$ is taken from $Q$ such that $X_u(n)$ is empty, the activity constraints are fired in order to
obtain a new set of active but unassigned attributes. That is, $X_u(n)$ is assigned

$$\{x_i \mid \mathrm{solution}(n), A \vdash \mathrm{active}(x_i)\} \setminus X_a(n)$$

where $X_a(n)$ represents the active, but already assigned attributes in node $n$.

In the priority queue $Q$, nodes are maintained by means of two heuristics: committed preference
$CP(n)$ and potential preference $PP(n)$. Here, given a node $n$,

$$CP(n) = \bigoplus_{x:d \,\in\, \mathrm{solution}(n)} P(x : d)$$

$$PP(n) = CP(n) \oplus \Big(\bigoplus_{x \in X_{nd}(n)} \max_{d \in D_x} P(x : d)\Big)$$

where $X_{nd}(n)$ is the set of unassigned attributes that can still be activated given the partial
assignment $\mathrm{solution}(n)$ (as indicated previously, the actual implementation employs an assumption-based
truth maintenance system (de Kleer, 1986) to efficiently determine which attribute's activity can no
longer be supported). In other words, $CP(n)$ is the preference associated with the partial attribute-
value assignment in node $n$ and $PP(n)$ is $CP(n)$ combined with the highest possible preference
assignments taken from all the values of the domains of those attributes in $X_{nd}(n)$. Thus, $PP(n)$
computes an upper boundary on the preference of an aDPCSP solution that includes the partial
attribute-value assignments corresponding to $n$.

Algorithm 1: SOLVE(X, D, C, A, P)

    n ← new node;
    solution(n) ← {};
    X_u(n) ← {x_i | {}, A ⊢ active(x_i)};
    X_a(n) ← {};
    CP(n) ← 0;
    PP(n) ← ⊕_{x ∈ X} max_{d ∈ D(x)} P(x : d);
    Q ← createOrderedQueue();
    enqueue(Q, n, PP(n), CP(n));
    while Q ≠ ∅
    do
        n ← dequeue(Q);
        if X_u(n) ≠ ∅
        then
            x ← first(X_u(n));
            PROCESS(x, n, C, A, P, Q);
        else
            X_u(n) ← {x_i | solution(n), A ⊢ active(x_i)} − X_a(n);
            if X_u(n) = ∅
            then
                n_next ← first(Q);
                if CP(n) ⊀ PP(n_next)
                then return (solution(n));
                else
                    PP(n) ← CP(n);
                    enqueue(Q, n, PP(n), CP(n));
            else
                x ← first(X_u(n));
                PROCESS(x, n, C, A, P, Q);

    procedure PROCESS(x, n_parent, C, A, P, Q)
    for d ∈ D(x)
    do
        if solution(n_parent) ∪ {x : d}, C ⊬ ⊥
        then
            n_child ← new node;
            solution(n_child) ← solution(n_parent) ∪ {x : d};
            X_d ← deactivated(solution(n_child), X_nd(n_parent));
            X_nd(n_child) ← X_nd(n_parent) − {x} − X_d;
            X_a(n_child) ← X_a(n_parent) ∪ {x};
            X_u(n_child) ← X_u(n_parent) − {x};
            CP(n_child) ← CP(n_parent) ⊕ P(x : d);
            PP(n_child) ← CP(n_child) ⊕ ⊕_{x ∈ X_nd(n_child)} max_{d ∈ D(x)} P(x : d);
            enqueue(Q, n_child, PP(n_child), CP(n_child));

The following theorem shows that algorithm 1 is guaranteed to find the set of attribute-value 
pairs with the highest combined preferences, within the set of solutions that satisfy the constraints. 

Theorem 1 SOLVE(X, D, C, A, P) is admissible.

Proof: SOLVE(X, D, C, A, P) is an A* algorithm guided by a heuristic function $PP(n) = CP(n) \oplus h(n)$,
where $CP(n)$ is the actual preference of node $n$ and $h(n) = \bigoplus_{x \in X_{nd}(n)} \max_{d \in D_x} P(x : d)$.
It follows from the previous discussion that $h(n)$ is greater than or equal to the combined preference
of any value-assignment of unassigned attributes that is consistent with the partial solution of $n$. In
this algorithm, the nodes $n$ are maintained in a priority queue in descending order of $PP(n)$. Let $\delta$
be a distance function that reverses the preference ordering such that $\delta(P_1) \prec \delta(P_2) \leftrightarrow P_1 \succ P_2$.
SOLVE(X, D, C, A, P) can then be described as an A* algorithm, where the nodes $n$ in the priority
queue Q are ordered in ascending order of $\delta(PP(n))$, such that $\delta(PP(n)) = \delta(CP(n)) \oplus \delta(h(n))$
and $\delta(h(n))$ is a lower bound on the distance between $n$ and the optimal solution. Therefore,
following the work by Hart, Nilsson and Raphael (1968), SOLVE(X, D, C, A, P) is an admissible
algorithm, guaranteed to find a solution $S$ with a minimal $\delta(P(S))$ or a maximal $P(S)$.
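
The best-first strategy used in the proof can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the algorithm of this paper: preferences are plain numbers combined by addition (instead of order-of-magnitude preferences combined by $\oplus$), all attributes are always active (no activity constraints), and the names `best_first`, `score` and `consistent` are hypothetical.

```python
import heapq
from itertools import count


def best_first(domains, score, consistent):
    """Return (assignment, total) with the highest total score, or None.

    domains:    dict mapping each attribute to its list of values
    score:      dict mapping (attribute, value) to a number (assumed additive)
    consistent: predicate over partial assignments (dicts)
    """
    attrs = list(domains)
    # best_rest[i] is the optimistic score still obtainable from attrs[i:];
    # it plays the role of h(n) in the proof.
    best_rest = [0] * (len(attrs) + 1)
    for i in range(len(attrs) - 1, -1, -1):
        best_rest[i] = best_rest[i + 1] + max(
            score[(attrs[i], d)] for d in domains[attrs[i]]
        )

    tie = count()  # unique tie-breaker so the heap never compares dicts
    # heapq is a min-heap, so PP is negated to pop the largest PP first.
    queue = [(-best_rest[0], next(tie), 0, {})]
    while queue:
        _neg_pp, _, cp, assignment = heapq.heappop(queue)
        if len(assignment) == len(attrs):
            # For a complete assignment PP equals CP, so no queued node can beat it.
            return assignment, cp
        x = attrs[len(assignment)]
        for d in domains[x]:
            child = {**assignment, x: d}
            if consistent(child):
                child_cp = cp + score[(x, d)]
                child_pp = child_cp + best_rest[len(child)]
                heapq.heappush(queue, (-child_pp, next(tie), child_cp, child))
    return None
```

Because the bound never underestimates what the remaining attributes can contribute, the first complete assignment that reaches the front of the queue is optimal, which is the admissibility argument in miniature.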

To illustrate algorithm 1, consider the problem of finding an ecological model that describes the 
behaviour of two populations, one of which predates on the other. An aDPCSP is constructed for 
the compositional modelling problem with the following attributes and domains. Note that section 
3 demonstrates how the attributes, domains and constraints of this problem can be constructed 
automatically and that section 4 illustrates these ideas in the context of a larger example. 



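$X = \{x_1, x_2, x_3, x_4, x_5, x_6\}$

$D_{x_1} = \{\text{yes}, \text{no}\}$

$D_{x_2} = \{\text{yes}, \text{no}\}$

$D_{x_3} = \{\text{yes}, \text{no}\}$

$D_{x_4} = \{\text{other}, \text{logistic}\}$

$D_{x_5} = \{\text{other}, \text{logistic}\}$

$D_{x_6} = \{\text{Holling}, \text{Lotka-Volterra}\}$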

The attributes $x_1$, $x_2$ and $x_3$ respectively describe the relevance of the following phenomena:
the change in size of the predator population, the change in size of the prey population and the
predation of the prey by the predator. The attributes $x_4$ and $x_5$ represent the choice of type of
population growth model. Two types of such models are incorporated in the problem: the logistic
one and the "other". Finally, attribute $x_6$ is associated with the choice of model type of the predation
phenomenon. Here, two types of model, the Holling model and the Lotka-Volterra model, are
included.

Because the Holling predation model presumes that logistic models are employed to describe
population growth, and because the Lotka-Volterra model incorporates its own population growth
model, the combinations of assignments to $x_4$, $x_5$ and $x_6$ are restricted. Hence, the aDPCSP
contains a set $C = \{c_{\{x_4,x_6\}}, c_{\{x_5,x_6\}}\}$ of compatibility constraints, with:

$c_{\{x_4,x_6\}} = \{(x_4 : \text{other}, x_6 : \text{Lotka-Volterra}), (x_4 : \text{logistic}, x_6 : \text{Holling})\}$
$c_{\{x_5,x_6\}} = \{(x_5 : \text{other}, x_6 : \text{Lotka-Volterra}), (x_5 : \text{logistic}, x_6 : \text{Holling})\}$

Furthermore, a model type of predator/prey growth must be selected if and only if the
corresponding population growth phenomenon is deemed relevant. Also, a model type of predation
must be selected if and only if both population growth phenomena and the predation
phenomenon are deemed relevant (because ecological models describing predation rely on submodels
describing population growth of the predator and the prey). Hence, the aDPCSP contains a set
$A = \{a_{x_4,\{x_1\}}, a_{x_5,\{x_2\}}, a_{x_6,\{x_1,x_2,x_3\}}\}$ of activity constraints, with:

$a_{x_4,\{x_1\}} = \{(x_1 : \text{yes}, \text{active}(x_4)), (x_1 : \text{no}, \neg\text{active}(x_4))\}$

$a_{x_5,\{x_2\}} = \{(x_2 : \text{yes}, \text{active}(x_5)), (x_2 : \text{no}, \neg\text{active}(x_5))\}$

$a_{x_6,\{x_1,x_2,x_3\}} = \{$
    $(x_1 : \text{yes}, x_2 : \text{yes}, x_3 : \text{yes}, \text{active}(x_6))$, $(x_1 : \text{yes}, x_2 : \text{yes}, x_3 : \text{no}, \neg\text{active}(x_6))$,
    $(x_1 : \text{yes}, x_2 : \text{no}, x_3 : \text{yes}, \neg\text{active}(x_6))$, $(x_1 : \text{yes}, x_2 : \text{no}, x_3 : \text{no}, \neg\text{active}(x_6))$,
    $(x_1 : \text{no}, x_2 : \text{yes}, x_3 : \text{yes}, \neg\text{active}(x_6))$, $(x_1 : \text{no}, x_2 : \text{yes}, x_3 : \text{no}, \neg\text{active}(x_6))$,
    $(x_1 : \text{no}, x_2 : \text{no}, x_3 : \text{yes}, \neg\text{active}(x_6))$, $(x_1 : \text{no}, x_2 : \text{no}, x_3 : \text{no}, \neg\text{active}(x_6))\}$



Finally, let the preference calculus consist of two orders of magnitude $O_{growth}$ and $O_{predation}$,
with $O_{growth} < O_{predation}$, where

$O_{growth} = \{p_{other}, p_{logistic}\}$ with $p_{logistic} < p_{other}$
$O_{predation} = \{p_{Holling}, p_{Lotka-Volterra}\}$ with $p_{Lotka-Volterra} < p_{Holling}$

The OMP assignments are as follows:

$P(x_4 : \text{other}) = P(x_5 : \text{other}) = p_{other}$
$P(x_4 : \text{logistic}) = P(x_5 : \text{logistic}) = p_{logistic}$
$P(x_6 : \text{Holling}) = p_{Holling}$
$P(x_6 : \text{Lotka-Volterra}) = p_{Lotka-Volterra}$

When applied to this problem, algorithm 1 initialises the search by creating a node $n_0$, where:

• $X_u(n_0)$, the set of currently active unassigned attributes, is initialised to $\{x_1, x_2, x_3\}$, because
the activity of these attributes does not depend on other attribute-value assignments.

• $X_a(n_0)$ and $CP(n_0)$ are initialised to the empty set and to the empty combined preference
respectively, since no attributes have been assigned yet.

• Finally, $PP(n_0)$ equals $p_{other} \oplus p_{other} \oplus p_{Holling}$ because this is the combination of highest
OMPs associated with each domain.

This initial node is enqueued in Q. Next, the algorithm proceeds through a number of iterations.
At each iteration, the node with most potential (as measured by PP and CP) is dequeued, and its
children are generated and enqueued in Q. The nodes that are created in this way are depicted in
Figure 2. The number $i$ in the subscript of each node $n_i$ indicates the order of node generation, and
the thick arrows show the order in which the search space is explored.

Note that there are three important features of the algorithm that could not be clearly demonstrated
within Figure 2. Firstly, at node $n_5$, the initial set of unassigned attributes is exhausted:
$X_u(n_5) = \{\}$. Therefore, the activity constraints are fired when $n_5$ is explored. Because $n_5$
corresponds to the assignment $\{x_1 : \text{yes}, x_2 : \text{yes}, x_3 : \text{yes}\}$, the remaining attributes are activated and
$X_u(n_5)$ is reset to $\{x_4, x_5, x_6\}$.

Secondly, node $n_{12}$ corresponds to an assignment of all (active) attributes that is consistent with
the activity and compatibility constraints:

$\{x_1 : \text{yes}, x_2 : \text{yes}, x_3 : \text{yes}, x_4 : \text{other}, x_5 : \text{other}, x_6 : \text{Lotka-Volterra}\}$
Figure 2: Search space explored by algorithm 1 when solving sample aDPCSP






This assignment is not a solution to the aDPCSP, because the corresponding preference is not
guaranteed to be maximal (and the assignment is, in fact, not optimal). After the creation of $n_{12}$, the
priority queue Q looks as follows (the ordering between $n_2$ and $n_4$ may vary since $PP(n_2) = PP(n_4)$
and $CP(n_2) = CP(n_4)$):

$\{n_{10}, n_8, n_{12}, n_6, n_2, n_4\}$

Therefore, the next node to be explored (after $n_9$ and the subsequent creation of $n_{12}$) is $n_{10}$.
Thirdly, node $n_{19}$ does correspond with an optimal solution. After its creation, Q equals:

$\{n_{19}, n_{12}, n_6, n_2, n_4\}$

As a consequence, $n_{19}$ is dequeued in the next iteration. Because no children of $n_{19}$ can be created
($X_u(n_{19}) = \{\}$ and the activity constraints activate no more attributes), $n_{19}$ is retained as a solution.
If the user is interested in finding multiple alternative solutions, the search may proceed until
Q contains no more nodes with a PP value that is not smaller than the maximum preference of
the first solution. In this case, $PP(n_{12}) \prec CP(n_{19})$ and hence, there is only one solution to this
aDPCSP.
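
The outcome of this worked example can be cross-checked by exhaustive enumeration. The sketch below is not algorithm 1: it simply lists every assignment that respects the activity and compatibility constraints and picks the most preferred one. The order-of-magnitude preferences are approximated, as an assumption made only for this illustration, by a pair (predation score, growth score) compared lexicographically, so that any difference at the predation level dominates differences at the growth level. The numeric scores and the names `GROWTH`, `PREDATION` and `candidates` are hypothetical.

```python
from itertools import product

# Hypothetical numeric stand-ins: only their relative order matters.
GROWTH = {"other": 2, "logistic": 1}             # p_logistic < p_other
PREDATION = {"Holling": 2, "Lotka-Volterra": 1}  # p_Lotka-Volterra < p_Holling
COMPATIBLE = {("other", "Lotka-Volterra"), ("logistic", "Holling")}


def candidates():
    """Yield (preference, assignment) for every consistent full assignment."""
    for x1, x2, x3 in product(["yes", "no"], repeat=3):
        # Activity constraints: an inactive attribute receives no value (None).
        d4 = list(GROWTH) if x1 == "yes" else [None]
        d5 = list(GROWTH) if x2 == "yes" else [None]
        d6 = list(PREDATION) if (x1, x2, x3) == ("yes", "yes", "yes") else [None]
        for x4, x5, x6 in product(d4, d5, d6):
            # Compatibility constraints only apply when x6 is active.
            if x6 is not None and not (
                (x4, x6) in COMPATIBLE and (x5, x6) in COMPATIBLE
            ):
                continue
            preference = (
                PREDATION.get(x6, 0),                   # higher order of magnitude
                GROWTH.get(x4, 0) + GROWTH.get(x5, 0),  # lower order of magnitude
            )
            yield preference, {
                "x1": x1, "x2": x2, "x3": x3, "x4": x4, "x5": x5, "x6": x6,
            }


best_preference, best_assignment = max(candidates(), key=lambda c: c[0])
print(best_assignment)
```

Under these assumptions the enumeration selects the assignment with logistic growth for both populations and the Holling predation model, in agreement with the preference recorded for $n_{19}$.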

3. Compositional Model Repositories 

The aDPCSPs discussed in the previous section provide the foundation for the development of the 
compositional model repositories. This section specifies the problem that a compositional model 
repository is built to solve and shows how it can be translated into an aDPCSP, and hence be resolved 
using the proposed aDPCSP solution algorithm. 

3.1 Background: assumption based truth maintenance 

An assumption-based truth maintenance system (ATMS) is a mechanism that keeps track of how
each piece of inferred information depends on presumed information and facts and of how
inconsistencies arise. In an ATMS, each piece of information used or derived by the problem solver
is stored as a node. Certain pieces of information are not known to be true and cannot be inferred
from other pieces of information, yet plausible inferences may be drawn from them. Such nodes are
categorised by a special type and referred to as assumptions.

Inferences between pieces of information are maintained within the ATMS as dependencies
between the corresponding nodes. In its extended form (see de Kleer, 1988; or Keppens, 2002), the
ATMS can take inferences, called justifications, of the form $n_i \wedge \ldots \wedge n_j \wedge \neg n_k \wedge \ldots \wedge \neg n_l \rightarrow n_m$,
where $n_i, \ldots, n_j, n_k, \ldots, n_l, n_m$ are nodes that the problem solver is interested in. An ATMS
can also take a specific type of justification, called nogood, that leads to an inconsistency, of the
form $n_i \wedge \ldots \wedge n_j \wedge \neg n_k \wedge \ldots \wedge \neg n_l \rightarrow \bot$ (meaning that at least one of the statements in
$\{n_i, \ldots, n_j, \neg n_k, \ldots, \neg n_l\}$ must be false). In the ATMS, these nogoods are represented as
justifications of a special node, called the nogood node.

Based on the given justifications and nogoods, the ATMS computes a label for each
(non-assumption) node. A label is a set of environments and an environment is a set of assumptions.
In particular, an environment $E$ depicts a possible world where all the assumptions in $E$ are true.
Thus, the label $\mathcal{L}(n)$ of a node $n$ describes all possible worlds in which $n$ can be true. The label
computation algorithm of the ATMS guarantees that each label is:

• Sound - All assumptions in any environment within the label of a node being true is a sufficient
condition to derive that node:

$\forall E \in \mathcal{L}(n), \; [(\bigwedge_{n_i \in E} n_i) \wedge (\bigwedge_{\neg n_i \in E} \neg n_i)] \vdash n$

• Consistent - No environment in the label of a node, other than the nogood node, describes an
impossible world:

$\forall E \in \mathcal{L}(n), \; [(\bigwedge_{n_i \in E} n_i) \wedge (\bigwedge_{\neg n_i \in E} \neg n_i)] \nvdash \bot$

• Minimal - The label does not contain possible worlds that are less general than one of the
other possible worlds it contains (i.e. environments that are supersets of other environments
in the label):

$\forall E \in \mathcal{L}(n), \; \nexists E' \in \mathcal{L}(n), \; E' \subset E$

• Complete - The label of each node, other than the nogood node, describes all possible worlds
in which that node can be inferred:

$\forall E, \; [(\bigwedge_{n_i \in E} n_i) \wedge (\bigwedge_{\neg n_i \in E} \neg n_i) \vdash n] \rightarrow$
$\exists E' \in \mathcal{L}(n), \; [(\bigwedge_{n_i \in E'} n_i) \wedge (\bigwedge_{\neg n_i \in E'} \neg n_i) \vdash n]$
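
The consistency and minimality properties can be illustrated with a small Python sketch. It is not the ATMS label update algorithm: it only filters a given collection of candidate environments, and it assumes, for simplicity, that environments and nogoods are plain sets of (positive) assumption names. The function name `normalise_label` is hypothetical.

```python
def normalise_label(environments, nogoods):
    """Return the consistent, minimal subset of the candidate environments.

    environments: iterable of sets of assumption names
    nogoods:      iterable of sets of assumption names known to be inconsistent
    """
    envs = {frozenset(e) for e in environments}
    bad = [frozenset(g) for g in nogoods]
    # Consistent: drop every environment that contains a nogood.
    consistent = {e for e in envs if not any(g <= e for g in bad)}
    # Minimal: drop every environment that is a strict superset of another one.
    return {e for e in consistent if not any(other < e for other in consistent)}


label = normalise_label(
    [{"A"}, {"A", "B"}, {"B", "C"}],
    nogoods=[{"B", "C"}],
)
print(label)  # only the environment {A} remains
```

Here {A, B} is removed because it is less general than {A}, and {B, C} is removed because it describes an impossible world.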

3.2 Knowledge Representation 

As with any other knowledge-based approach, building a compositional modeller requires a
formalism for the specification of its inputs, its outputs and its knowledge base. The work developed
here is loosely based on the compositional modelling language (Bobrow, Falkenhainer, Farquhar, Fikes,
Forbus, Gruber, Iwasaki, & Kuipers, 1996), a proposed standard knowledge representation
formalism for compositional modellers, but adapted to meet the challenges of the ecological
compositional modelling problems identified in the introduction.

3.2.1 Preliminary concepts

The most primitive constructs in a compositional modeller are participants, relations and assump- 
tions. This subsection summarises these concepts and explains how they are represented herein. 

Participants² refer to the objects of interest, which are involved in the scenario or its model.
These participants may be real-world objects or conceptual objects, such as variables that express 
features of real-world objects in a mathematical model. For instance, a population of a species is 
a typical example of a real-world object, and a variable that expresses the number of individuals 
of this species forms an example of a conceptual object. It is natural to group objects that share 
something in common into classes. Participants are herein grouped into participant classes, with 
each representing a set of participants that share certain common features. Each class will be given 
a name for easy reference. 

Relations describe how the participants are related to one another. As with participants, some
relations represent a real-world relationship, such as:

    predation(frog, insect)    (1)

Other relations may be conceptual in nature, such as equation (2), which describes an important
textbook model of logistic population growth (Ford, 1999):

    $\frac{d}{dt}\,\text{change} = \text{parameter} \times \text{size} \times \left(1 - \frac{\text{size}}{\text{capacity}}\right)$    (2)

2. Some of the previous work in compositional modelling refers to these as individuals and quantities, but such names
would not suit the present application. Ecological models typically describe the behaviour of populations rather than
that of individuals and it is often hard to distinguish between quantities.

To be consistent with other compositional modelling approaches, this paper employs a LISP-style
notation for relations. As such, the above two sample relations become:

    (predation frog insect)    (1)

    (d/dt change (* change-rate size (- 1 (/ size capacity))))    (2)
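
To make the prefix notation concrete, the following Python sketch reads such a relation into a nested list. It is a minimal reader written for this illustration only (the function name `parse` is hypothetical); it is not part of the compositional modelling language and it assumes well-formed, balanced input.

```python
def parse(text):
    """Parse one LISP-style relation into nested Python lists of strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        # Returns (parsed item, position of the next unread token).
        if tokens[pos] == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        return tokens[pos], pos + 1

    expr, end = read(0)
    if end != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return expr


print(parse("(predation frog insect)"))
print(parse("(d/dt change (* change-rate size (- 1 (/ size capacity))))"))
```

The first element of each list is the relation or operator name and the remaining elements are its arguments, which may themselves be nested relations.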



Assumptions form a special type of relation that are employed to distinguish between alternative 
model design decisions. Internally, assumptions will be stored in the form of assumption nodes in 
the ATMS (see section 3.3.1), but in the knowledge base, assumptions appear as relations with a 
specific syntax and semantics. 

Two types of assumptions are employed in this article. Relevance assumptions state what
phenomena are to be included in or excluded from the scenario model. Typical examples of phenomena
are the population growth and predation phenomena. The general format of a relevance assumption
is shown in (3). The phenomenon that is incorporated in the scenario model when describing a
relevance assumption is identified by <name> and is specific to the subsequent participants or relations.
For example, relevance assumption (4) states that the growth of participant ?population is to be
included in the model.

    (relevant <name> [{<participant>} | <relation>])    (3)

    (relevant growth ?population)    (4)



Model assumptions specify which type of model is utilised to describe the behaviour of a certain
participant or relation. Typical examples of model types include the exponential (Malthus, 1798)
and the logistic (Verhulst, 1838) model types of population growth. The formal specification of a
model assumption is given in (5). Often the <name> in (5) corresponds to the name of a known
(partial) model of the phenomenon or process being described. The example in (6) states that the
population ?population is being modelled using the logistic approach.

    (model [<participant> | <relation>] <name>)    (5)

    (model ?population logistic)    (6)
Figure 3: Stock flow diagram of predator prey scenario model

3.2.2 Scenarios and scenario models 

As formalised by Keppens and Shen (2001b), a compositional modeller takes two inputs and pro- 
duces one output. The first input is a representation (which is itself a model) that describes the 
system of interest by means of an accessible formalism. This model, which normally consists of 
(mainly) real-world participants and their interrelationships, is called the scenario. The second input 
is the task description. It is a formal description of the criteria by which the adequacy of the output 
is evaluated. The output is a new model that describes the scenario in a more detailed formalism, 
usually a set of variables and equations, which the model-based reasoner can employ readily. Such a 
model, which normally contains conceptual participants and interrelationships, is called a scenario 
model. The aim of any compositional modeller is to translate the scenario into a scenario model, by 
means of the task description. 

In this work, a model is formally defined by a tuple $\langle P, R \rangle$, where $P$ is a set of participants and
$R$ is a set of relations over the participants in $P$. This definition applies to both the scenario and the
scenario model. A typical example of a scenario is a description of a predator population, a prey
population and a predation relation between the predator and the prey. This scenario is a model
$\langle P, R \rangle$ with:

    P = {predator, prey}

    R = {(predation predator prey)}
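
As an illustration of this definition, a model can be held in a small Python data structure. The sketch below is an assumption about one convenient encoding, not part of the paper's implementation: relations are tuples whose first element is the relation name, and the class name `Model` and the method `undeclared` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A model <P, R>: participants and relations over those participants."""
    participants: frozenset
    relations: frozenset  # each relation is a tuple: (name, arg1, arg2, ...)

    def undeclared(self):
        """Return relation arguments that are not declared participants."""
        used = {arg for rel in self.relations for arg in rel[1:]}
        return used - self.participants


# The predator-prey scenario from the text.
scenario = Model(
    participants=frozenset({"predator", "prey"}),
    relations=frozenset({("predation", "predator", "prey")}),
)
print(scenario.undeclared())  # empty: every argument is a declared participant
```

The same structure can hold a scenario model, since both the scenario and the scenario model are tuples of this form.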

The aim of the compositional model repository is to translate a scenario into a scenario model.
Within this work, both the system dynamics stock-flow formalism (Forrester, 1968) and ordinary
differential equations (ODEs) will be employed as the modelling formalisms. For example, a scenario
model that corresponds to the above scenario is depicted in Figure 3. Formally, a scenario model is
another model $\langle P, R \rangle$ and in this case

$P = \{N_{predator}, B_{predator}, D_{predator}, N_{prey}, B_{prey}, D_{prey}, P_{prey},$
$\quad b_{predator}, b_{prey}, d_{predator}, d_{prey}, C_{predator}, C_{prey},$
$\quad s_{(prey,predator)}, t_{(prey,predator)}, r_{(predator,prey)}\}$

    Symbol                               Variable name
    $N_{predator}$, $N_{prey}$           number of predators, prey
    $B_{predator}$, $B_{prey}$           natality of predators, prey
    $D_{predator}$, $D_{prey}$           mortality of predators, prey
    $P_{prey}$                           predation of prey
    $b_{predator}$, $b_{prey}$           natality-rate of predators, prey
    $d_{predator}$, $d_{prey}$           mortality-rate of predators, prey
    $C_{predator}$, $C_{prey}$           capacity of predators, prey
    $s_{(prey,predator)}$                search-rate
    $t_{(prey,predator)}$                prey-handling-time
    $r_{(predator,prey)}$                prey-requirement

Table 1: Variables in the stock flow diagram and the mathematical model

$R = \{\; \frac{d}{dt} N_{predator} = B_{predator} - D_{predator},$

$\quad \frac{d}{dt} N_{prey} = B_{prey} - D_{prey} - P_{prey},$

$\quad B_{predator} = b_{predator} \times N_{predator},$

$\quad B_{prey} = b_{prey} \times N_{prey},$

$\quad D_{predator} = d_{predator} \times N_{predator} \times \frac{N_{predator}}{C_{predator}},$

$\quad D_{prey} = d_{prey} \times N_{prey} \times \frac{N_{prey}}{C_{prey}},$

$\quad P_{prey} = \frac{s_{(prey,predator)} \times N_{prey} \times N_{predator}}{1 + s_{(prey,predator)} \times N_{prey} \times t_{(prey,predator)}},$

$\quad C_{predator} = r_{(predator,prey)} \times N_{prey},$

$\quad C_{prey} = \ldots \;\}$




The relation between the variables of the mathematical model and those used in the stock-flow
diagram is given in Table 1. Generally speaking, stock-flow diagrams are graphical representations of
systems of (ordinary or qualitative) differential equations. In the automated modelling literature in
general, and engineering and physical systems modelling in particular, more sophisticated
representational formalisms have been developed to enable the identification of mathematical models of the
behaviour of dynamic systems from observations. Examples include bond graphs (Karnopp,
Margolis, & Rosenberg, 1990) and generalised physical networks (Easley & Bradley, 1999). However,
the potential benefits of these more advanced formalisms are not exploited here, and their use remains
interesting future work. Instead, stock-flow diagrams are employed throughout this paper as they
are far more commonly used in ecological modelling (Ford, 1999).

It is often possible to construct multiple scenario models from a single given scenario, and the 
task specification is employed to guide the search for the most appropriate one(s). In this work, 
scenario models are selected on the basis of hard constraints and user preferences. The hard con- 
straints stem from restrictions imposed on compositionality by the representational framework (see 
section 3.2.3) and from properties the scenario model is required to satisfy (see section 3.2.3). The 
    Name             Syntax (infix notation)                    Syntax (prefix notation)
    Addition         ?var = C^+(formula)                        (== ?var (C-add formula))
                     ?var = C^-(formula)                        (== ?var (C-sub formula))
    Multiplication   ?var = C^×(formula)                        (== ?var (C-mul formula))
                     ?var = C^÷(formula)                        (== ?var (C-div formula))
    Selection        ?var = C^{if,p}(antecedent, formula)       (== ?var (C-if antecedent formula :priority p))
                     ?var = C^{else}(formula)                   (== ?var (C-else formula))

Table 2: Composable functors and composable relations

user preferences express the user's subjective view as to which modelling approaches are more 
appropriate in the context of the current scenario (see section 2.2). 

3.2.3 The knowledge base 

To construct scenario models from a given scenario, a compositional modeller relies on the use 
of a knowledge base that is particular to the problem domain. To illustrate the ideas, this section 
presents the constructs employed in the compositional modeller that is developed to synthesise 
scenario models in the ecological domain. 

Composable relations The knowledge base in this approach consists of partial models that can be 
instantiated and composed into more complex scenario models. The composition of partial models 
into a scenario model may involve the composition of partial relations (coming from different partial 
models) in compounded relations. In the sample scenario model of section 3.2.2, the following 
relation describes the changes of population size of the prey population:

    $\frac{d}{dt} N_{prey} = B_{prey} - D_{prey} - P_{prey}$    (7)




In (7), $N_{prey}$ is the population size, $B_{prey}$ the number of births, $D_{prey}$ the number of natural deaths
and $P_{prey}$ the number of prey who died due to predation. Thus, relation (7) actually describes two
phenomena that affect the population size $N_{prey}$: natural population growth ($B_{prey} - D_{prey}$) and
predation related deaths ($P_{prey}$). When constructing the knowledge base, it is desirable to represent
these two phenomena in isolation because they do not always occur in combination. For example,
some species do not have predators, and it is therefore unnecessary to always include predation
as a cause of death. From this viewpoint, relation (7) can be seen as composed from different
composable relations in the knowledge base:

    $\frac{d}{dt} N_{prey} = C^{+}(B_{prey})$,    $\frac{d}{dt} N_{prey} = C^{-}(D_{prey})$

    $\frac{d}{dt} N_{prey} = C^{-}(P_{prey})$


The use of composable relations enables the knowledge base to cover as many combinations 
of the phenomena that may affect a relation as possible, by representing each phenomenon indi- 
vidually rather than precompiling everything together. Because only the component parts (i.e. the 
composable relations) of relations need to be represented, instead of all possible, and however com- 
plex, combinations of them, the knowledge base can be smaller and more effective. This section 
describes how such composable relations are represented in the knowledge base, as well as whether 
and how they can be composed to form compounded relations. 






Composable relations are those containing composable functors and for which a method of
composition exists (that describes how a complete set of composable relations can be composed).
The composable functors employed are those proposed by Bobrow et al. (1996) with a new addition:
composable selection. A summary of such composable relations is presented in Table 2.

The composable relations introduced by Bobrow et al. (1996) are easy to understand. The
formulae $v = C^{+}(f)$ and $v = C^{-}(f)$ represent terms (respectively $f$ and $-f$) of a sum, and
the formulae $v = C^{\times}(f)$ and $v = C^{\div}(f)$ represent factors (respectively $f$ and $\frac{1}{f}$) of a product.

However, ecological models in use typically contain selection statements which declare that
one certain equation must be employed when a condition is satisfied and some other one otherwise.
Formally, a selection is a relation of the form

    if $c_1$ then $v = r_1$ else if $c_2$ . . . else $v = r_n$    (8)

where $v$ is a participant, each $c_i$ (with $i = 1, \ldots, n - 1$) is a relation describing a condition statement
and each $r_j$ (with $j = 1, \ldots, n$) is a relation. This selection relation consists of the partial relations:

    if $c_i$ then $v = r_i$    with $i = 1, \ldots, n - 1$

    else $v = r_n$

Therefore, a selection relation can be composed from two types of composable relation. The first
is a composable "if" relation, which has the form $v = C^{if,p}(a, f)$, where $v$ is a participant, $p$ is an
element taken from a total order, such as the set of natural numbers $\mathbb{N}$, which denotes the priority of
the composable "if" relation in the sequence, and $a$ and $f$ are two given relations. The second type
of composable relation is a composable "else" relation, which has the form $v = C^{else}(f_{else})$, where
$f_{else}$ is a given relation assigned to $v$ if none of the antecedents in the composable "if" relations is
true.

To illustrate this notation, the selection relation (8) can be composed from the following
composable relations:

    $v = C^{if,p_1}(c_1, r_1)$
    . . .
    $v = C^{if,p_{n-1}}(c_{n-1}, r_{n-1})$
    $v = C^{else}(r_n)$

with $p_1 > \ldots > p_{n-1}$.

To combine the composable relations, a number of rules are defined to implement the semantics 
of the representational formalism. In theory, a set of rules can be generated that enables the aggre- 
gation of any set of composable relations. In practice, however, a trade-off must be made between 
flexibility (the ability to combine many different types of composable relation) and comprehensi- 
bility (the use of a set of rules that is easily understood by the knowledge engineer who employs 
composable relations). Thus, the types of composable relations that can be combined have to be
restricted.

Table 3 summarises what composable relations can be joined to form compounded relations. 
The principle guiding the construction of this table is to allow only the composition of relations of 
certain types for which a resulting compound relation is intuitively obvious. For example, according 
                             C^+(f_2)   C^-(f_2)   C^×(f_2)   C^÷(f_2)   C^{if,p_2}(a_2,f_2)   C^{else}(f_2)
    C^+(f_1)                 yes        yes        no         no         no                    no
    C^-(f_1)                 yes        yes        no         no         no                    no
    C^×(f_1)                 no         no         yes        yes        no                    no
    C^÷(f_1)                 no         no         yes        yes        no                    no
    C^{if,p_1}(a_1,f_1)      no         no         no         no         yes                   yes
    C^{if,p_2}(a_1,f_1)      no         no         no         no         no                    yes
    C^{else}(f_1)            no         no         no         no         yes                   no

Table 3: Composability of composable relations


to Table 3, a composable addition relation $x = C^{+}(y)$ can be combined with a composable
subtraction relation $x = C^{-}(z)$ because their combination is clearly $x = y - z$. However, according to
Table 3, a composable addition relation $x = C^{+}(y)$ cannot be combined with a composable
multiplication relation $x = C^{\times}(z)$, because an arbitrary and non-intuitive rule would otherwise have to
be defined to decide whether the compound relation would be $x = y + z$ or $x = y \times z$.

The order in which the composable selections must be considered is defined by the priorities
(or is implicit in the case of $C^{else}$). Therefore, composable selections can be combined with one
another provided no two composable "if" relations have the same priority.
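
The composability restrictions of Table 3 can be encoded as a small lookup. The sketch below is an illustrative Python rendering under the reading given in the text (sums combine with sums, products with products, selections with selections unless two "if" relations share a priority or two "else" relations meet). The functor names follow the prefix syntax of Table 2, while `GROUP` and `composable` are hypothetical names.

```python
# Each functor belongs to exactly one family of compoundable relations.
GROUP = {
    "C-add": "sum", "C-sub": "sum",
    "C-mul": "product", "C-div": "product",
    "C-if": "selection", "C-else": "selection",
}


def composable(first, second):
    """Decide whether two composable relations on the same variable can be joined.

    first, second: (functor, priority) pairs; priority is None unless the
    functor is "C-if".
    """
    (f1, p1), (f2, p2) = first, second
    if GROUP[f1] != GROUP[f2]:
        return False          # e.g. addition with multiplication
    if f1 == f2 == "C-else":
        return False          # only a single "else" relation is allowed
    if f1 == f2 == "C-if":
        return p1 != p2       # priorities must differ
    return True


print(composable(("C-add", None), ("C-sub", None)))  # True
print(composable(("C-add", None), ("C-mul", None)))  # False
```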

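In order to derive the actual rules of composition, the sets of all composable relations with the
same functor for a given model $\langle P, R \rangle$ are defined first:

    $R(v, C^{+}) = \{v = C^{+}(f_i) \mid (v = C^{+}(f_i)) \in R\}$
    $R(v, C^{-}) = \{v = C^{-}(f_i) \mid (v = C^{-}(f_i)) \in R\}$
    $R(v, C^{\times}) = \{v = C^{\times}(f_i) \mid (v = C^{\times}(f_i)) \in R\}$
    $R(v, C^{\div}) = \{v = C^{\div}(f_i) \mid (v = C^{\div}(f_i)) \in R\}$
    $R(v, C^{if}) = \{v = C^{if,p_i}(a_i, f_i) \mid (v = C^{if,p_i}(a_i, f_i)) \in R\}$
    $R(v, C^{else}) = \{v = C^{else}(f_i) \mid (v = C^{else}(f_i)) \in R\}$
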



From this, the rules of composition can be built as given in the expressions (9), (10) and (11). 
They jointly state how a given set of composable relations can be rewritten as a single compound 
relation. Each of these rules contains a complete set of all composable relations in the antecedent. 
In particular, the antecedent of rule (9) contains the set of all composable addition and subtraction 
relations with the same participant v in the left-hand side. 

Similarly, the antecedent of rule (10) contains the complete set of composable multiplication
relations. Finally, the antecedent of rule (11) is satisfied for the complete set of composable if and else
relations with the same left-hand participant v, provided that the priorities are strictly ordered (i.e. 
no two priorities are equal) and that there is only a single composable else relation. The latter two 
conditions are added because two composable if relations with the same priority or two composable 
else relations can not be compounded. The consequents of the rules of composition explain how 
these complete sets of composable relations can be joined. This is simply a matter of applying the 
appropriate mathematical operation to the provided terms. 
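    $R(v, C^{+}) = \{v = C^{+}(f_{1+}), \ldots, v = C^{+}(f_{m+})\} \;\wedge$
    $R(v, C^{-}) = \{v = C^{-}(f_{1-}), \ldots, v = C^{-}(f_{n-})\} \;\rightarrow$    (9)
    $v = f_{1+} + \ldots + f_{m+} - (f_{1-} + \ldots + f_{n-})$

    $R(v, C^{\times}) = \{v = C^{\times}(f_{1\times}), \ldots, v = C^{\times}(f_{m\times})\} \;\wedge$
    $R(v, C^{\div}) = \{v = C^{\div}(f_{1\div}), \ldots, v = C^{\div}(f_{n\div})\} \;\rightarrow$    (10)
    $v = \frac{1 \times f_{1\times} \times \ldots \times f_{m\times}}{f_{1\div} \times \ldots \times f_{n\div}}$

    $R(v, C^{if}) = \{v = C^{if,p_1}(a_1, f_1), \ldots, v = C^{if,p_m}(a_m, f_m)\} \;\wedge$
    $R(v, C^{else}) = \{v = C^{else}(f_{else})\} \;\wedge\; p_1 > \ldots > p_m \;\rightarrow$    (11)
    $v =$ if $a_1$ then $f_1$, else . . . , if $a_m$ then $f_m$, else $f_{else}$
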
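
The three rules of composition can be sketched as a single Python function that rewrites a complete set of composable relations for one variable into a compound relation, returned here as a string. This is an illustration only: the tuple encoding of relations, the function name `compose`, and the choice to write 0 for an empty sum of added terms are assumptions of this sketch rather than part of the formalism.

```python
def compose(v, relations):
    """Compose all composable relations with left-hand participant v.

    relations: list of tuples ("C-add", f), ("C-sub", f), ("C-mul", f),
    ("C-div", f), ("C-if", priority, antecedent, f) or ("C-else", f).
    """
    kinds = {r[0] for r in relations}
    if kinds and kinds <= {"C-add", "C-sub"}:            # rule (9)
        plus = [r[1] for r in relations if r[0] == "C-add"]
        minus = [r[1] for r in relations if r[0] == "C-sub"]
        rhs = " + ".join(plus) if plus else "0"
        if minus:
            rhs += " - (" + " + ".join(minus) + ")"
        return f"{v} = {rhs}"
    if kinds and kinds <= {"C-mul", "C-div"}:            # rule (10)
        num = " * ".join(["1"] + [r[1] for r in relations if r[0] == "C-mul"])
        den = [r[1] for r in relations if r[0] == "C-div"]
        rhs = "(" + num + ") / (" + " * ".join(den) + ")" if den else num
        return f"{v} = {rhs}"
    if kinds and kinds <= {"C-if", "C-else"}:            # rule (11)
        ifs = sorted(
            (r for r in relations if r[0] == "C-if"),
            key=lambda r: r[1],
            reverse=True,  # highest priority is tested first
        )
        elses = [r for r in relations if r[0] == "C-else"]
        priorities = [r[1] for r in ifs]
        if len(set(priorities)) != len(priorities) or len(elses) != 1:
            raise ValueError("need distinct priorities and exactly one else")
        parts = [f"if {a} then {f}" for _, _, a, f in ifs] + [elses[0][1]]
        return f"{v} = " + " else ".join(parts)
    raise ValueError("relations of these types cannot be composed")


print(compose("dN/dt", [("C-add", "B"), ("C-sub", "D"), ("C-sub", "P")]))
```

Applied to the three composable relations derived from relation (7), the function returns the sum of births minus the sum of natural and predation-related deaths, which is algebraically the same as (7).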

Property definitions Property definitions describe features of interest to the application requiring
a scenario model. A property definition $\Pi$ is a tuple $\langle P^s, \Phi, \pi \rangle$ where $P^s = \{p_1, \ldots, p_k\}$ is a set of
source-participants, $\Phi$ is a predicate calculus sentence whose free variables are elements of $P^s$, and
$\pi$ is a relation, whose free variables are also elements of $P^s$, such that

    $\forall p_1, \ldots, \forall p_k, \; \Phi \rightarrow \pi$


A typical example of a feature of interest is the requirement that a certain variable in the model 
is endogenous or exogenous. To be more specific, the property definitions below describe when a 
variable ?v is endogenous and exogenous respectively. 

(defproperty endogenous
    :source-participants ((?v :type variable))
    :structural-condition ((or (== ?v *) (d/dt ?v *)))
    :property (endogenous ?v))

(defproperty exogenous
    :source-participants ((?v :type variable))
    :structural-condition ((not (endogenous ?v)))
    :property (exogenous ?v))

The first definition states that whenever either $?v = *$ or $\frac{d}{dt} ?v = *$ is true (where * matches
any constant or formula), ?v is deemed to be endogenous. The second property definition indicates
that a variable is said to be exogenous if such an object exists and it is not endogenous.
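
The two property definitions amount to a simple structural test on the relations of a model. The Python sketch below illustrates that test under an assumed encoding in which each relation is a tuple whose first element is its name; the function names and the sample relations are hypothetical and chosen only for the example.

```python
def endogenous(variable, relations):
    """True if some relation (== variable ...) or (d/dt variable ...) exists."""
    return any(
        len(rel) >= 2 and rel[0] in ("==", "d/dt") and rel[1] == variable
        for rel in relations
    )


def exogenous(variable, relations):
    """A variable of the model is exogenous when it is not endogenous."""
    return not endogenous(variable, relations)


# Sample relations in prefix form, written as nested tuples.
relations = [
    ("d/dt", "size", "change"),
    ("==", "change", ("*", "rate", "size")),
]
print(endogenous("size", relations))   # True: size has a differential equation
print(exogenous("rate", relations))    # True: rate is never defined by a relation
```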

By describing such features formally in the knowledge base, property definitions enable them 
to be imposed as criteria on the selection of scenario models. In this way, the variable describing 
the size of a particular population in an eco-system, for instance, can be forced to be endogenous. 

Note that required properties can be specified in two different ways: either globally as goals for 
the scenario model construction or locally as a required purpose of a certain model fragment. The 
latter use of model properties will be illustrated later. 
Model fragments Model fragments are the building blocks with which scenario models are
constructed. A model fragment $\mu$ is a tuple $\langle P^s, P^t, \Phi^s, \Phi^t, A, \Pi \rangle$ where $P^s = \{p^s_1, \ldots, p^s_m\}$ is a
set of variables called source-participants, $P^t = \{p^t_1, \ldots, p^t_n\}$ is a set of variables called
target-participants, $\Phi^s = \{\phi^s_1, \ldots, \phi^s_v\}$ is a set of relations, called structural conditions, whose free
variables are elements of $P^s$, $\Phi^t = \{\phi^t_1, \ldots, \phi^t_w\}$ is a set of relations, called postconditions, whose free
variables are elements of $P^s \cup P^t$, $A = \{a_1, \ldots, a_y\}$ is a set of relations, called assumptions, and
$\Pi$ is a set of relations, called purpose-required properties, such that:

    $\forall \phi^t \in \Phi^t, \forall p^s_1, \ldots, \forall p^s_m, \exists p^t_1, \ldots, \exists p^t_n, \; \phi^s_1 \wedge \ldots \wedge \phi^s_v \rightarrow (a_1 \wedge \ldots \wedge a_y \rightarrow \phi^t)$    (12)

    $\forall \pi \in \Pi, \forall p^s_1, \ldots, \forall p^s_m, \forall p^t_1, \ldots, \forall p^t_n, \; \phi^t_1 \wedge \ldots \wedge \phi^t_w \wedge a_1 \wedge \ldots \wedge a_y \wedge \neg\pi \rightarrow \bot$    (13)

Note that, in this work, each property definition $\langle P^s, \Phi, \pi \rangle$ is equivalent to a model fragment
$\langle P^s, \{\}, \Phi, \{\pi\}, \{\}, \{\} \rangle$.

For example, the model fragment below states that a population ?p can be described by two 
variables ?p-size (describing the size of ?p) and ?p-change (describing the rate of change in 
population size) and a differential equation 

$$\frac{d}{dt}\,\texttt{?p-size} = \texttt{?p-change}$$

The usage of this partial scenario model is subject to two conditions: (1) the growth phenomenon is 
relevant with regard to ?p, and (2) the variable ?p-change is endogenous in the eventual scenario 
model. The former requirement is indicated by the relevance assumption and the latter by the 
purpose-required property: 

(defModelFragment population-growth
  :source-participants ((?p :type population))
  :assumptions ((relevant growth ?p))
  :target-participants ((?p-size :type variable)
                        (?p-change :type variable))
  :postconditions ((size-of ?p-size ?p)
                   (change-of ?p-change ?p)
                   (d/dt ?p-size ?p-change))
  :purpose-required ((endogenous ?p-change)))

The purpose-required property is usually satisfied by additional model fragments, such as the 
one below: 

(defModelFragment logistic-population-growth
  :source-participants ((?p :type population)
                        (?p-size :type variable)
                        (?p-change :type variable))
  :structural-conditions ((size-of ?p-size ?p)
                          (change-of ?p-change ?p))
  :assumptions ((model ?p-size logistic))
  :target-participants ((?r :type parameter)
                        (?k :type variable)
                        (?d :type variable))
  :postconditions ((capacity-of ?k ?p)
                   (density-of ?d ?p-size)
                   (== ?d (C-add (/ ?p-size ?k)))
                   (== ?p-change (- (* ?r ?p-size (- 1 ?d))))))
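To make the relations in this fragment concrete, the following Python sketch evaluates them numerically. It assumes that the last postcondition reads as the standard logistic form, change = r * size * (1 - d) with d = size / k, and it integrates d/dt size = change with a simple Euler step. The function names, parameter values and step size are illustrative assumptions.

```python
def logistic_change(size, r, k):
    """Rate of change of the population size under the logistic model."""
    density = size / k                 # corresponds to ?d = ?p-size / ?k
    return r * size * (1.0 - density)  # assumed reading of the ?p-change relation


def simulate(size0, r, k, dt, steps):
    """Euler integration of d/dt size = change; returns the whole trajectory."""
    size = size0
    trajectory = [size]
    for _ in range(steps):
        size += dt * logistic_change(size, r, k)
        trajectory.append(size)
    return trajectory


trajectory = simulate(size0=10.0, r=0.5, k=100.0, dt=0.1, steps=500)
```

With these settings the simulated size rises monotonically and levels off at the capacity k, which is the qualitative behaviour expected of the logistic model.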
Model fragments are rules of inference that describe how new knowledge can be derived from
existing knowledge by committing the emerging model to certain assumptions. They are used 
to generate a space of possible models. Model fragments are instantiated by matching source- 
participants to existing participants in the scenario or an emerging model, and by matching the 
structural conditions to corresponding relations. For each possible instantiation, a new instance is 
generated for each of the target-participants, and where necessary, new instances are also created for 
the postconditions and assumptions. Such instances, as well as the inferential relationships between 
the instances of the source-participants, structural conditions and assumptions on the one hand, and 
those of the target-participants and postconditions on the other, are stored in an ATMS, forming the 
model space. This is to be further explained in section 3.3.1. 

A model fragment is said to be applied if it is instantiated and the underlying assumptions 
hold. If a model fragment is applied, the instances of the target-participants and postconditions 
corresponding to the instantiation of that model fragment must be added to the resulting model. With 
respect to the above example, the model fragment that implements the logistic population growth 
model is instantiated whenever variables exist that describe the size and change in a population, and 
it is applied if the logistic model for population size has also been selected. 

Note that in most compositional modellers, such as the ones devised by Heller and Struss (1998, 
2001); Levy, Iwasaki and Fikes (1997); Nayak and Joskowicz (1996); and Rickel and Porter (1997), 
model fragments represent direct translations of components of physical systems into influences
between variables. Because the compositional modeller presented herein aims to serve as an ecological
model repository, the contents of the model fragments employed differ from those of conventional
compositional modellers in two important regards:

Firstly, model fragments contain partial models describing certain phenomena instead of in- 
fluences. These partial models normally correspond to those developed in ecological modelling 
research. Typical examples include the logistic population growth model (Verhulst, 1838) and the
Holling predation model (Holling, 1959) devised in the population dynamics literature.

Secondly, the partial models contained in the model fragments often need to be composed
incrementally. For example, the aforementioned sample model fragment logistic-population-growth
requires an emerging scenario model, which may be generated by the other sample model
fragment population-growth. Thus, one model fragment, e.g. logistic-population-growth,
can expand on the partial model contained in another, e.g. population-growth.
Because of this feature, it is (correctly) presumed that no model fragment $\mu$ generates new relations
that are preconditions of model fragments that $\mu$ expands on. Violating this presumption would
make little sense in the context of the present application as it would imply a recursive extension of
an emerging scenario model with the same set of variables and equations.

3.2.4 Participant class declaration and participant type hierarchies 

In general, participant classes need not be defined. However, certain types of participant may be
described in terms of other interesting participants, irrespective of the modelling choices. This 
feature provides syntactic sugar for describing important relations between participants, making it 
easier to declare required properties of a scenario model in terms of the participants of the scenario. 
For example, the behaviour of a population may be described in terms of population size and growth 
rate variables: 

(defEntity population 

:participants (size growth-rate)) 
Participant class declarations may also be employed within model fragments to provide a more
specific definition of the meaning of the source-participants and the target-participants. In this way,
participant specifications are constrained to be a feature of another participant by means of the
:entity statement, as the following example illustrates:

(defModelFragment define-population-growth-phenomenon
  :source-participants ((?p :type population))
  :target-participants
    ((?ps :type stock :entity (size ?p))
     (?pg :type variable :entity (growth-rate ?p))
     (?pb :type flow)
     (?pd :type flow))
  :assumptions ((relevant growth ?p))
  :postconditions ((== ?pg (- ?pb ?pd))
                   (flow ?pb source ?pl)
                   (flow ?pd ?pl sink)))

Furthermore, participant class declarations may define one class to be an immediate subclass of 
another. For example, the population participant class of holometabolous insects (e.g. butterflies) 
may be defined as a subclass of the population participant class: 

(defEntity holometabolous-insect-population
  :subclass-of (population)
  :participants
    (larva-number pupa-number adult-number))

In this way, a participant type hierarchy is defined. Each subclass inherits all participants of its 
superclasses (i.e. its immediate superclass and superclasses of superclasses). 
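The inheritance rule can be sketched in a few lines of Python. The dictionary layout and the function name `all_participants` are assumptions made for this illustration, and the sketch assumes that each class has at most one immediate superclass.

```python
# Hypothetical in-memory form of the two defEntity declarations above.
ENTITIES = {
    "population": {
        "subclass_of": None,
        "participants": ["size", "growth-rate"],
    },
    "holometabolous-insect-population": {
        "subclass_of": "population",
        "participants": ["larva-number", "pupa-number", "adult-number"],
    },
}


def all_participants(entity, entities=ENTITIES):
    """Collect the participants of a class and of all its superclasses."""
    collected = []
    while entity is not None:
        declaration = entities[entity]
        # Prepend so that inherited participants are listed first.
        collected = declaration["participants"] + collected
        entity = declaration["subclass_of"]
    return collected
```

Calling `all_participants("holometabolous-insect-population")` walks up the hierarchy and returns the two participants inherited from `population` followed by the three declared on the subclass.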

In summary, a participant class declaration is a tuple $\Pi = \langle \Pi_s, P \rangle$ where $\Pi_s$ is a participant
class, called the immediate superclass of the participant class, and $P$ is a set of participant classes
that describe important features of the participant class.

3.3 Inference 

The compositional modelling method presented herein employs a four step inference procedure: 

1. Model space construction. The model space is an ATMS that efficiently stores all the partici-
pants, relations and model design decisions (represented in the form of relevance and model 
assumptions) that may be part of the final scenario model, as well as the conditions under 
which each of these participants and relations must or must not be part of the scenario model. 

2. aDCSP construction. The model space contains a number of hard constraints on the partici- 
pants and relations that may be combined. This inference step extracts such restrictions and 
translates them into an aDCSP. 

3. Inclusion of order-of-magnitude preferences. Preferences are associated with relevance and
model assumptions in the scenario space as they reflect the relative appropriateness of these
assumptions, resulting in an aDPCSP.

4. Scenario model selection. This inference step solves the aDPCSP. The resulting solutions 
correspond to scenario models that are consistent according to the domain knowledge and 
optimise the overall preference with respect to the order-of-magnitude preference calculus. 
Figure 4: Inference procedures of the compositional modeller

These four steps correspond to the four squares of the compositional model repository in Figure 4.

In this section, each of these inference steps is discussed in detail and illustrated by means of 
simple examples. The next section contains a more detailed example and shows how this procedure 
can be applied to a non-trivial ecological modelling domain. 

3.3.1 Scenario + Knowledge Base = Model Space 

As previously stated, the aim of a compositional modeller is to translate a scenario into a scenario 
model. Both are representations of the system of interest though they model the system at a different 
level of detail. The knowledge base provides the foundation for translation. All the scenario models 
that can be constructed from the given scenario, with regard to the knowledge base, are stored in the 
model space. 

A model space is an ATMS (de Kleer, 1986) containing all the participants, relations and
assumptions that can be instantiated from a given scenario. In this work, the generalised version of
the ATMS, as introduced by de Kleer (1988), is employed as it allows the use of negations of nodes
in the justifications. The algorithm GENERATEMODELSPACE($\langle O, R \rangle$) describes how such a model
space can be created from a scenario $\langle O, R \rangle$. It first initialises the model space $\theta$ with the
participant instances ($O$) and the relation instances ($R$) from the scenario. Then, for each model
fragment whose source-participants and structural conditions match participants and relations already
in $\theta$, new instances of its target-participants, assumptions and postconditions are added to $\theta$.
Because each property definition $\langle P^s, \Phi, \pi \rangle$ is equivalent to a model fragment
$\langle P^s, \{\}, \Phi, \{\pi\}, \{\}, \{\} \rangle$, this procedure applies to property definitions as well as model
fragments. Matching the source-participants and structural conditions of a model fragment $\mu$ to the
emerging model space is performed by the function match($\mu, \theta, \sigma$) as specified below, where $\mu$ is
the model fragment being matched, and $\sigma$ is a substitution from the source-participants of $\mu$ to
participant instances.

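$$\mathrm{match}(\mu, \theta, \sigma) = \begin{cases} \mathit{true} & \text{if } \sigma = \{p^s_1/o_1, \ldots, p^s_m/o_m\} \wedge P^s(\mu) = \{p^s_1, \ldots, p^s_m\} \wedge o_1 \in \theta \wedge \ldots \wedge o_m \in \theta \wedge \forall \phi \in \Phi^s(\mu),\ \sigma\phi \in \theta \\ \mathit{false} & \text{otherwise} \end{cases}$$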
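The predicate can be read as the following Python sketch. It is an illustration only: the model space is flattened into a plain set that holds both participant instances (strings) and relation instances (tuples), participant types are ignored, and the dictionary keys are hypothetical names.

```python
def match(fragment, theta, sigma):
    """Return True if sigma binds every source-participant to an instance in
    theta and every substituted structural condition is already in theta."""
    if set(sigma) != set(fragment["source_participants"]):
        return False
    if not all(instance in theta for instance in sigma.values()):
        return False
    return all(tuple(sigma.get(term, term) for term in phi) in theta
               for phi in fragment["structural_conditions"])


logistic = {
    "source_participants": ["?p", "?p-size", "?p-change"],
    "structural_conditions": [("size-of", "?p-size", "?p"),
                              ("change-of", "?p-change", "?p")],
}
theta = {"frog", "n_frog", "c_frog",
         ("size-of", "n_frog", "frog"), ("change-of", "c_frog", "frog")}

good = {"?p": "frog", "?p-size": "n_frog", "?p-change": "c_frog"}
swapped = {"?p": "frog", "?p-size": "c_frog", "?p-change": "n_frog"}
```

The substitution `good` satisfies the predicate, whereas `swapped` binds existing instances but fails the structural conditions, so no instantiation would be created for it.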



Each match, specified by a model fragment $\mu$ and a substitution $\sigma$, is processed as follows:

• For each assumption $a \in A(\mu)$, a new node, denoting the assumption instance $\sigma a$, is created
and added to $\theta$.

• Then, a new node $n_{(\sigma,\mu)}$, denoting the instantiation of $\mu$ via substitution $\sigma$, is created, added
to $\theta$ and justified by the implication:

$$\Big(\bigwedge_{a \in A(\mu)} \sigma a\Big) \wedge \Big(\bigwedge_{p \in P^s(\mu)} \sigma p\Big) \wedge \Big(\bigwedge_{\phi \in \Phi^s(\mu)} \sigma\phi\Big) \rightarrow n_{(\sigma,\mu)}$$

• Finally, a new instance for each target-participant $p \in P^t(\mu)$ and for each postcondition
$\phi \in \Phi^t(\mu)$, provided $\sigma\phi$ does not already exist in the model space $\theta$, is created. For the
target-participants, this involves creating a new symbol for each new participant instance with
the function gensym() and extending $\sigma$ with the substitution $\{p/\mathrm{gensym}()\}$. A new node $n$
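Algorithm 1: GENERATEMODELSPACE($\langle O, R \rangle$)

$\theta \leftarrow$ new ATMS;
for each $o \in O$ do add-node($\theta$, $o$);
for each $r \in R$ do add-node($\theta$, $r$);
for each $\mu$, $\sigma$ such that match($\mu$, $\theta$, $\sigma$)
  do justification $\leftarrow \emptyset$;
     for each $a \in A(\mu)$
       do newnode $\leftarrow$ add-node($\theta$, $\sigma a$);
          justification $\leftarrow$ justification $\cup$ {newnode};
     for each $p \in P^s(\mu)$
       do justification $\leftarrow$ justification $\cup$ {find-node($\theta$, $\sigma p$)};
     for each $\phi \in \Phi^s(\mu)$
       do justification $\leftarrow$ justification $\cup$ {find-node($\theta$, $\sigma\phi$)};
     add-node($\theta$, $n_{(\sigma,\mu)}$);
     add-justification($\theta$, $n_{(\sigma,\mu)}$, $\bigwedge_{n \in \text{justification}} n$);
     for each $p \in P^t(\mu)$
       do $\sigma \leftarrow \sigma \cup \{p/\mathrm{gensym}()\}$;
          $o \leftarrow$ add-node($\theta$, $\sigma p$);
          add-justification($\theta$, $o$, $n_{(\sigma,\mu)}$);
     for each $\phi \in \Phi^t(\mu)$
       do if $\sigma\phi \in \theta$
            then $o \leftarrow$ get-node($\theta$, $\sigma\phi$);
            else $o \leftarrow$ add-node($\theta$, $\sigma\phi$);
          add-justification($\theta$, $o$, $n_{(\sigma,\mu)}$);
for each $n_1, \ldots, n_m$ such that inconsistent($\{n_1, \ldots, n_m\}$)
  do add-justification($\theta$, $n_\bot$, $n_1 \wedge \ldots \wedge n_m$);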



[Figure: the instances of the assumptions $A(\mu)$, of the source-participants $P^s(\mu)$ and of the structural conditions $\Phi^s(\mu)$ jointly justify the node $n_{(\sigma,\mu)}$, which in turn justifies the instances of the target-participants $P^t(\mu)$ and of the postconditions.]

Figure 5: Model fragment application



is created and added to $\theta$ for each new participant instance $\sigma p$ and for each new instantiated
relation $\sigma\phi$. Each of these nodes is justified by the implication $n_{(\sigma,\mu)} \rightarrow n$.
[Figure panels: (a) Inconsistency caused by a global property, where $\pi$ is a global property that must be satisfied by all consistent scenario models; (b) Inconsistency caused by a purpose-required property, where $\pi$ is a purpose-required property in a model fragment $\mu$, and $\mu$ is applied with a substitution $\sigma$; (c) Inconsistency caused by non-composable relations.]

Figure 6: Sources of inconsistency

To illustrate this procedure, Figure 5 shows a graphical representation of the inferences that are
constructed by applying a model fragment $\mu = \langle P^s, P^t, \Phi^s, \Phi^t, A, \{\} \rangle$ with respect to a
substitution $\sigma$.

Once all possible applications of model fragments have been exhausted, the inconsistencies in
the model space are identified and recorded in the ATMS. In the algorithm, nogoods are generated
for each set $\{n_1, \ldots, n_m\}$ of inconsistent nodes, denoted inconsistent($\{n_1, \ldots, n_m\}$). There are
three sources of inconsistencies that are each reported to the ATMS in a different way:

• Global properties: Let $\pi$ be an instance of a global property that any scenario model must
satisfy. Then, any combination of assumptions and negations of assumptions that prevents $\pi$
from being satisfied is inconsistent. Therefore, inconsistent($\{\neg\pi\}$) must be reported for any
required global property $\pi$. This type of inconsistency is depicted in Figure 6(a).

• Purpose-required properties: Any application of a model fragment $\mu$ without satisfying its
purpose-required properties $\Pi(\mu)$ yields an inconsistency (see (13)). Hence, for each node
$n_{(\sigma,\mu)}$ denoting the instantiation of $\mu$ via substitution $\sigma$, and for each node $n_{\sigma\pi}$ describing the
appropriate instance of a purpose-required property $\pi \in \Pi(\mu)$, inconsistent($\{n_{(\sigma,\mu)}, \neg n_{\sigma\pi}\}$)
is reported. This type of inconsistency is depicted in Figure 6(b).

• Non-composable relations: In any mathematical formalism designed to describe simulation
models of dynamic systems, certain combinations of relations may over-constrain the model,
and hence, be unsuitable for generating the behaviour of a system of interest. Within the
system dynamics and ODE formalisms used in this paper, assignments of relations to the
same variable are only composable if those relations are explicitly deemed composable. In
other words, two relations $v = r_i$ and $v = r_j$ can only be combined with one another if $r_i$
and $r_j$ are composable. Examples of pairs of non-composable relations include

X = C^{y) and x = C^ {z) because C^ and C^ relations are not composable, and 
a = C^{h) and a = c-\- d because c + d is not a composable relation. 

Combinations of such non-composable relations must be reported as an inconsistency as well. 
This type of inconsistency is depicted in Figure 6(c). 
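The third source of inconsistency can be illustrated with a small Python sketch that flags pairs of assignments to the same variable when at least one of them is not wrapped in a composable operator. The tuple encoding is hypothetical, and the sketch assumes, for simplicity, that C-add and C-sub are the only composable operators and that they may be combined with one another, as in the stock-flow fragments of section 4.1.

```python
from itertools import combinations

# Assumption of this sketch: only these operators mark a relation as composable.
COMPOSABLE_OPERATORS = {"C-add", "C-sub"}


def is_composable(rhs):
    """A right-hand side is composable if it is wrapped in a composable operator."""
    return isinstance(rhs, tuple) and rhs[0] in COMPOSABLE_OPERATORS


def non_composable_pairs(assignments):
    """Return the pairs of assignments to one variable that cannot be combined."""
    conflicts = []
    for (v1, r1), (v2, r2) in combinations(assignments, 2):
        if v1 == v2 and not (is_composable(r1) and is_composable(r2)):
            conflicts.append(((v1, r1), (v2, r2)))
    return conflicts


assignments = [
    ("a", ("C-add", "b")),
    ("a", ("+", "c", "d")),       # plain sum, not a composable relation
    ("s", ("C-add", "births")),
    ("s", ("C-sub", "deaths")),
]
conflicts = non_composable_pairs(assignments)
```

Each pair returned by `non_composable_pairs` would be reported to the ATMS as a set of inconsistent nodes; in this example only the two assignments to `a` conflict.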
[Figure: partial model space for the population frog, showing the assumptions (relevant growth frog) and (model n_frog logistic), the instantiated model fragments (including population-growth and the endogenous property definition), and the relations they justify, such as (change-of c_frog frog) and (endogenous c_frog).]

Figure 7: Partial model space

To illustrate the model space construction algorithm, Figure 7 presents a small sample model
space. It results from the application of the population-growth and logistic-population-growth
model fragments and the endogenous property definition, which were described
earlier, for a single population "frog". If a larger scenario involving multiple populations and
relations between these populations were specified, a similar partial model space would be generated
for each individual population.
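The construction of such a model space can be approximated by the Python sketch below. It is a simplification under stated assumptions: it records plain (antecedents, consequent) justifications instead of maintaining ATMS labels, it ignores type hierarchies and nogoods, and the dictionary-based encoding of fragments is hypothetical.

```python
from itertools import count, product


def substitute(relation, sigma):
    """Apply substitution sigma to every term of a relation tuple."""
    return tuple(sigma.get(term, term) for term in relation)


def generate_model_space(participants, relations, fragments):
    """Forward-chain model fragments until no new match is found."""
    participants, relations = dict(participants), set(relations)
    justifications, processed, fresh = [], set(), count(1)
    changed = True
    while changed:
        changed = False
        for fragment in fragments:
            variables = [var for var, _ in fragment["source"]]
            pools = [[o for o, t in participants.items() if t == type_]
                     for _, type_ in fragment["source"]]
            for combo in product(*pools):
                sigma = dict(zip(variables, combo))
                conditions = [substitute(c, sigma) for c in fragment["conditions"]]
                key = (fragment["name"], combo)
                if key in processed or not all(c in relations for c in conditions):
                    continue
                processed.add(key)
                changed = True
                assumptions = [substitute(a, sigma) for a in fragment["assumptions"]]
                node = ("applied", fragment["name"], combo)
                justifications.append((assumptions + list(combo) + conditions, node))
                for var, type_ in fragment["targets"]:
                    # Stand-in for gensym(): a fresh, numbered instance name.
                    sigma[var] = f"{var.lstrip('?')}_{next(fresh)}"
                    participants[sigma[var]] = type_
                    justifications.append(([node], sigma[var]))
                for post in fragment["postconditions"]:
                    instance = substitute(post, sigma)
                    relations.add(instance)
                    justifications.append(([node], instance))
    return participants, relations, justifications


FRAGMENTS = [
    {"name": "population-growth",
     "source": [("?p", "population")], "conditions": [],
     "assumptions": [("relevant", "growth", "?p")],
     "targets": [("?p-size", "variable"), ("?p-change", "variable")],
     "postconditions": [("size-of", "?p-size", "?p"),
                        ("change-of", "?p-change", "?p"),
                        ("d/dt", "?p-size", "?p-change")]},
    {"name": "logistic-population-growth",
     "source": [("?p", "population"), ("?p-size", "variable"),
                ("?p-change", "variable")],
     "conditions": [("size-of", "?p-size", "?p"), ("change-of", "?p-change", "?p")],
     "assumptions": [("model", "?p-size", "logistic")],
     "targets": [("?r", "parameter"), ("?k", "variable"), ("?d", "variable")],
     "postconditions": [("capacity-of", "?k", "?p"), ("density-of", "?d", "?p-size")]},
]

participants, relations, justifications = generate_model_space(
    {"frog": "population"}, [], FRAGMENTS)
```

Starting from the single participant `frog`, the first fragment introduces the size and change variables, after which the second fragment matches them and adds its own participants. The loop stops once a full pass produces no new instantiation.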

3.3.2 From Model Space to aDCSP

Once the model space has been constructed, it can be translated into an aDCSP. The translation 
procedure, summarised as algorithm CREATEADCSP(), consists of three steps as described below: 



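Algorithm 2: CREATEADCSP()
comment: $\sigma$ is the set of substitutions

comment: Generate attributes and domains
for each $A$ such that assumption-class($A$)
  do $x \leftarrow$ create-attribute();
     $D(x) \leftarrow \{\}$;
     $\sigma \leftarrow \sigma \cup \{A/x\}$;
     for each $a \in A$
       do $v \leftarrow$ create-value();
          $D(x) \leftarrow D(x) \cup \{v\}$;
          $\sigma \leftarrow \sigma \cup \{a/x : v\}$;

comment: Generate activity constraints
for each $A$ such that assumption-class($A$)
  do $s \leftarrow$ subject($A$);
     for each $\{a_{1\top}, \ldots, a_{p\top}, \neg a_{1\bot}, \ldots, \neg a_{q\bot}\} \in \mathcal{L}(s)$
       do add($\sigma a_{1\top} \wedge \ldots \wedge \sigma a_{p\top} \wedge \sigma\neg a_{1\bot} \wedge \ldots \wedge \sigma\neg a_{q\bot} \rightarrow \mathrm{active}(\sigma A)$);

comment: Generate compatibility constraints
for each $\{a_{1\top}, \ldots, a_{p\top}, \neg a_{1\bot}, \ldots, \neg a_{q\bot}\} \in \mathcal{L}(n_\bot)$
  do add($\sigma a_{1\top} \wedge \ldots \wedge \sigma a_{p\top} \wedge \sigma\neg a_{1\bot} \wedge \ldots \wedge \sigma\neg a_{q\bot} \rightarrow \bot$);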
1. Generate the attributes and domain values from the assumptions. The aDCSP attributes
correspond to the underlying assumption classes (i.e. groups of assumptions indicating alternative
choices with regards to the same model construction decision). A relevance assumption and
its negation jointly form an assumption class. For example, $A_1$ = {(relevant growth
frog), $\neg$(relevant growth frog)} specifies such an assumption class. The set of
model assumptions involving the same participants/relations, but with different model names
and hence different descriptions, also form an assumption class. For instance, $A_2$ = {(model
n_frog exponential), (model n_frog logistic), (model n_frog other)}, where
n_frog is a variable denoting the size of a population, specifies such an assumption class.
Running this step of the algorithm, an attribute is created for each assumption class, with the
domain of such an attribute consisting of all assumption instances in the assumption class.

2. Create activity constraints. The attributes and domain values generated in the previous step
are only meaningful in situations where the participant and/or relation instances contained in
the arguments of the corresponding assumptions exist. For example, the assumption (model
n_frog logistic) is only relevant if the participant instance n_frog exists. Clearly, all
assumptions within one assumption class have the same participant and/or relation instances as
their arguments. Because each assumption class corresponds to one attribute, the attribute
can be activated if and only if the participant and/or relation instances associated with the
related assumption class are active. Therefore, this step creates activity constraints that activate
an attribute based on the conjunction of the environments contained within the labels of the
participants/relations of the assumption class. For instance, as can be deduced from Figure
7, n_frog is activated when (relevant growth frog) is committed. Thus, the attribute
corresponding to assumption class $A_2$, defined in step 1, is activated under the attribute value
assignment associated with the (relevant growth frog) assumption.

3. Create compatibility constraints. In the ATMS (or model space), all sources of
inconsistencies are contained in the label of the nogood node. Therefore, the compatibility constraints
are created directly by translating the environments in the label $\mathcal{L}(n_\bot)$ into the corresponding
conjunctions of attribute-value assignments.
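The three steps above can be sketched in Python as follows. This is an illustration under simplifying assumptions: labels are supplied directly as lists of environments, every assumption appearing in an environment is itself a value of some assumption class (so a negated relevance assumption is represented by its own string), and the attribute and value names `x1`, `v1_1` and so on are invented by the sketch.

```python
def create_adcsp(assumption_classes, subject_labels, nogood_label):
    """Translate assumption classes and ATMS labels into an aDCSP.

    assumption_classes: dict class name -> list of assumption strings
    subject_labels:     dict class name -> list of environments (frozensets)
    nogood_label:       list of environments (frozensets)
    """
    domains, sigma = {}, {}
    # Step 1: one attribute per assumption class, one value per assumption.
    for i, (name, assumptions) in enumerate(assumption_classes.items(), 1):
        attribute = f"x{i}"
        domains[attribute] = []
        sigma[name] = attribute
        for j, assumption in enumerate(assumptions, 1):
            value = f"v{i}_{j}"
            domains[attribute].append(value)
            sigma[assumption] = (attribute, value)
    # Step 2: each environment in a subject's label activates the attribute.
    activity = [(sorted(sigma[a] for a in env), sigma[name])
                for name in assumption_classes
                for env in subject_labels.get(name, [])]
    # Step 3: each environment of the nogood label forbids an assignment.
    compatibility = [sorted(sigma[a] for a in env) for env in nogood_label]
    return domains, activity, compatibility


classes = {
    "A1": ["(relevant growth frog)", "not (relevant growth frog)"],
    "A2": ["(model n_frog exponential)", "(model n_frog logistic)",
           "(model n_frog other)"],
}
labels = {
    "A1": [frozenset()],                              # frog is part of the scenario
    "A2": [frozenset({"(relevant growth frog)"})],    # n_frog needs growth to be relevant
}
# Hypothetical nogood, included only to exercise step 3.
nogoods = [frozenset({"(relevant growth frog)", "(model n_frog other)"})]

domains, activity, compatibility = create_adcsp(classes, labels, nogoods)
```

In the result, the attribute for $A_1$ is always active (its activation condition is empty), while the attribute for $A_2$ is activated only by the value that stands for (relevant growth frog).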

3.3.3 aDCSP + Preferences = aDPCSP

The aDCSP produced as above formalises the hard requirements imposed upon the scenario models. 
Among the scenario models that meet these requirements, some may be better than others, because 
the underlying model design decisions may be deemed more appropriate by the user. Preferences 
that express this (relative) level of appropriateness are attached to the assumptions that describe the 
model design decisions, and by extension, to the attribute-value pairs in the aDCSP. As discussed in
section 2, such an extension to the aDCSP constitutes an aDPCSP. 

More specifically, it is worth recalling that in section 2.2 an order-of-magnitude preference
calculus is presented that enables representation and reasoning with subjective user preferences for
different relevance and modelling assumptions. Next, section 2.3 introduces a solution algorithm for
aDPCSPs that include an aDCSP, such as the ones constructed with the approach of section 3.3.2, 
and are extended with subjective user preferences for alternative design decisions. 
3.4 Outline analysis of complexity

The complexity of the work arises from four major sources: 1) model space construction, 2) label 
propagation in the ATMS, 3) model space to aDCSP translation, and 4) aDPCSP solution. 

GENERATEMODELSPACE($\langle O, R \rangle$) essentially performs a fixed sequence of instructions and
produces a small set of nodes and inferences for each match of a model fragment. Therefore, its 
time and space complexity are linear with respect to the number of possible matches of model 
fragments. CREATEADCSP() extracts certain information from the model space and rewrites it in 
a different formalism without further manipulations. Therefore, its time and space complexity are 
linear with respect to the size of the model space. 

The label propagation algorithm of an ATMS is known to have an exponential time complexity. 
However, because the model space is built up incrementally (by GENERATEMODELSPACE($\langle O, R \rangle$))
from the root nodes of the ATMS network (i.e. those that correspond to facts and have no
antecedents) to the leaf nodes (i.e. those that have no consequents, other than the nogood node)
and because the inconsistencies are added at the end, this complexity only increases exponentially 
with the depth of the network and the number of participants and relations in individual model frag- 
ments, rather than with the size of the model space. This fact significantly limits the complexity 
impact of label propagation. Firstly, the depth of the ATMS network is restricted by the domain. 
In many conventional compositional modellers, where model fragments are direct translations from 
scenario components to scenario model equations, this depth would be only one. Empirically, con- 
structing the model space for sophisticated eco-systems, the depth of a model space never exceeded 
8. Secondly, the size of the individual model fragments does not change significantly with the size 
of the knowledge base. 

The fourth and final source of complexity is driven by the fact that the constraint satisfaction 
algorithm must determine a consistent combination of assumptions in the model space. The space 
of attribute value assignments increases exponentially with the number of assumptions
and hence, with the model space. Thus, the overall complexity of the present approach is largely 
dominated by the constraint satisfaction algorithm employed. 
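As a small worked illustration of this growth, the Python fragment below counts complete attribute-value assignments as the product of the domain sizes. It assumes that every attribute is active, which gives an upper bound, and the example of one binary relevance class plus one three-valued model class per population is hypothetical.

```python
from math import prod


def assignment_space_size(domain_sizes):
    """Upper bound on the number of complete attribute-value assignments."""
    return prod(domain_sizes)


def size_for_populations(n):
    """Assumed setting: per population, one class with 2 and one with 3 values."""
    return assignment_space_size([2, 3] * n)
```

Under these assumptions one population yields 6 candidate assignments, three populations yield 216, and each additional population multiplies the count by 6.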

If the user does not specify any preference, the CSP is an aDCSP. Recently, a number of efficient
methods have been devised for solving aDCSPs as presented by Minton et al. (1992); Mittal and 
Falkenhainer (1990); and Verfaillie and Schiex (1994). This helps minimise the overhead incurred 
for compositional modelling. 
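For orientation only, the following Python sketch enumerates the solutions of a tiny aDCSP by naive backtracking. It is not one of the methods cited above. It assumes that attributes are supplied in an order in which activation conditions mention only earlier attributes, and the attribute names, values and constraints in the example are invented.

```python
def holds(condition, assignment):
    """True if every (attribute, value) pair of the condition is assigned."""
    return all(assignment.get(x) == v for x, v in condition)


def solutions(order, domains, activity, nogoods, assignment=None):
    """Yield assignments to exactly the active attributes that violate no nogood."""
    assignment = {} if assignment is None else assignment
    if any(holds(nogood, assignment) for nogood in nogoods):
        return
    if not order:
        yield dict(assignment)
        return
    x, rest = order[0], order[1:]
    active = any(holds(cond, assignment)
                 for cond, target in activity if target == x)
    if not active:
        # Inactive attributes receive no value in the solution.
        yield from solutions(rest, domains, activity, nogoods, assignment)
        return
    for value in domains[x]:
        assignment[x] = value
        yield from solutions(rest, domains, activity, nogoods, assignment)
        del assignment[x]


domains = {"x1": ["relevant", "irrelevant"],
           "x2": ["exponential", "logistic", "other"]}
activity = [([], "x1"), ([("x1", "relevant")], "x2")]
nogoods = [[("x1", "relevant"), ("x2", "other")]]

found = list(solutions(["x1", "x2"], domains, activity, nogoods))
```

The attribute `x2` is only assigned when `x1` takes the value that activates it, so one of the three solutions leaves `x2` unassigned.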

With preferences, the CSP becomes an aDPCSP. As argued in section 2, this presents a new 
problem that has not yet been studied in detail. In this work, an A* algorithm has been proposed to 
implement the CSP solution method. This approach is known to be the most efficient in terms of 
the proportion of the search space the algorithm needs to explore before finding an optimal solution, 
when compared to other search methods that are based on the same heuristic (Hart et al., 1968). A 
disadvantage is that it incurs an exponential space complexity. As explained by Miguel and Shen 
(2001a, 2001b); and Tsang (1993), a wide range of alternative solution techniques exist for ordinary 
CSPs and many of these could also be extended to solve aDPCSPs. A detailed examination of these 
techniques is a topic of future research. 

3.5 Automated modelling and scientific discovery 

As mentioned previously, a compositional model repository is designed in order to compose models 
from a system's structure and relevant domain knowledge. As such, this approach gives rise to a
potentially beneficial means to operationalise the outcomes of scientific discovery. More specifically,
the resultant compositional model repositories will allow existing knowledge on model construction
to be applied to previously unencountered scenarios and to support investigation into situations which
may be physically difficult to replicate or create but which may be synthesised in computational
representations.

The present work has been applied to the vegetation component of the MODMED n-species 
model (Legg, Muetzelfeldt, & Heathfield, 1995). This n-species model offers a system dynamics 
representation of populations of Mediterranean vegetation and of how they are affected by
populations of farm animals, climate and environmental management. The purpose of the model is to
be instantiated with respect to various Mediterranean communities, and to serve as a component 
of a very large scale simulation that is designed to simulate the effects of various environmental 
policies on the Mediterranean landscape. A knowledge base containing approximately 60 model
fragments and 4 property definitions has been constructed in about two man-weeks, on the basis of
the most complex parts of the n-species model. This knowledge base can be employed to
reconstruct variations of the n-species model to accommodate a variety of possible scenarios, as well as
to examine simplifications of the original n-species model which exclude certain phenomena. 

The compositional model repository is most closely related to the seminal work on compo- 
sitional modelling (Falkenhainer & Forbus, 1991). That approach has a similar functionality but 
it is devised specifically for physical systems and relies on a component-connection formalism to 
represent scenarios. 

Another approach has recently been developed and applied to the ecological domain by
Heller and Struss (1998, 2001). This work derives a system's structure from observations of its
behaviour and domain knowledge. Therefore, it is able to perform diagnosis of ecological systems
and therapy suggestion. Another important distinction of this work from the present study is that
it presumes that each process can be described in only one way instead of allowing multiple
alternative models.

In the machine learning community, a number of approaches have been devised by Bradley, 
Easley and Stolle (2001); Langley et al. (2002); and Todorovski and Dzeroski (1997, 2001) to 
induce sets of differential equations from a) observations of behaviour, b) domain knowledge rep- 
resented in the form of hypothetical equations, and c) a description of the structure of the system. 
These approaches aim at scientific discovery by generalising observed behaviour into mathematical 
models. The specifications of the scenario and the domain knowledge in these methods are similar 
to those used in this article. This is especially true for the work by Langley et al. (2002); and Todor- 
ovski and Dzeroski (1997, 2001), because that work has also been applied to population dynamics. 
However, the internal mechanisms of these approaches are very different as they essentially rely on 
exhaustive search procedures instead of constraint satisfaction techniques. 

4. A Population Dynamics Example 

The examples used throughout the previous sections were taken from a more extensive application 
study of the present work. The application aimed to construct a repository of basic population
dynamic models, describing the phenomena of growth, predation and competition. This section 
presents an overview of how the proposed approach is employed in this application to show the 
ability of the work to scale to larger problems. 
4.1 Knowledge base

This subsection illustrates how a set of model fragments can be constructed. The challenge of 
this task lies in the fact that model fragments must encompass a sufficiently general and reusable 
component part of the ecological models. In instances of models found in the literature on ecological 
modelling, the boundaries of the recurring component parts are hidden, and it is therefore up to the 
knowledge engineer to identify them. 

First, a hierarchy of entity types is set up. The system dynamics models shown earlier contain 
only three types of participant: variables, stocks and flows. Here, stocks and flows are a special
type of variable with a predetermined meaning. That is, a flow $f$ into a stock $s$ corresponds to the
equation $\frac{d}{dt}s = C^{+}(f)$ and a flow $f$ out of a stock $s$ denotes $\frac{d}{dt}s = C^{-}(f)$. Hence, stocks and
flows are defined as subclasses of the participant class variable:

(defEntity variable) 
(defEntity stock 

:subclass-of (variable)) 
(defEntity flow 

:subclass-of (variable)) 

The sample properties defined in section 3.2.3, which describe the condition under which a 
variable is endogenous or exogenous, are employed in this knowledge base: 

(defproperty endogenous-1
  :source-participants ((?v :type variable))
  :structural-conditions ((== ?v *))
  :property (endogenous ?v))

(defproperty endogenous-2
  :source-participants ((?v :type variable))
  :structural-conditions ((d/dt ?v *))
  :property (endogenous ?v))

(defproperty exogenous
  :source-participants ((?v :type variable))
  :structural-conditions ((not (endogenous ?v)))
  :property (exogenous ?v))

The next three model fragments contain the rules of the stock-flow diagrams employed by
system dynamics models. They respectively describe that:

• A flow ?flow into a stock ?stock corresponds to the composable differential equation:

$$\frac{d}{dt}\,\texttt{?stock} = C^{+}(\texttt{?flow})$$

• A flow ?flow out of a stock ?stock corresponds to the composable differential equation:

$$\frac{d}{dt}\,\texttt{?stock} = C^{-}(\texttt{?flow})$$

• A flow ?flow from one stock ?stock1 to another stock ?stock2 corresponds to the
composable differential equations:

$$\frac{d}{dt}\,\texttt{?stock1} = C^{-}(\texttt{?flow}) \quad \text{and} \quad \frac{d}{dt}\,\texttt{?stock2} = C^{+}(\texttt{?flow})$$
(defModelFragment inflow
  :source-participants
    ((?stock :type stock)
     (?flow :type flow))
  :structural-conditions
    ((flow ?flow source ?stock))
  :postconditions
    ((d/dt ?stock (C-add ?flow))))

(defModelFragment outflow
  :source-participants
    ((?stock :type stock)
     (?flow :type flow))
  :structural-conditions
    ((flow ?flow ?stock sink))
  :postconditions
    ((d/dt ?stock (C-sub ?flow))))

(defModelFragment inflow
  :source-participants
    ((?stock1 :type stock)
     (?stock2 :type stock)
     (?flow :type flow))
  :structural-conditions
    ((flow ?flow ?stock1 ?stock2))
  :postconditions
    ((d/dt ?stock1 (C-sub ?flow))
     (d/dt ?stock2 (C-add ?flow))))
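The three stock-flow rules can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes that composable terms contributed to the same stock are combined by summation, that the reserved names `source` and `sink` mark the model boundary, and the helper names `compose_stock_terms` and `net_rate` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Flow = Tuple[str, str, str]   # (flow name, origin, destination)
Term = Tuple[str, str]        # ("+" for C-add or "-" for C-sub, flow name)

def compose_stock_terms(flows: List[Flow]) -> Dict[str, List[Term]]:
    """Collect the C-add / C-sub terms that each flow contributes to a stock."""
    terms: Dict[str, List[Term]] = defaultdict(list)
    for name, origin, destination in flows:
        if origin != "source":        # the flow leaves a stock: C-sub term
            terms[origin].append(("-", name))
        if destination != "sink":     # the flow enters a stock: C-add term
            terms[destination].append(("+", name))
    return dict(terms)

def net_rate(stock: str, terms: Dict[str, List[Term]],
             flow_values: Dict[str, float]) -> float:
    """Evaluate d/dt of a stock by summing its signed flow terms
    (assumption: composable terms combine additively)."""
    return sum(flow_values[name] if sign == "+" else -flow_values[name]
               for sign, name in terms.get(stock, []))

# One stock with a birth inflow and a death outflow.
population_terms = compose_stock_terms([
    ("births", "source", "size"),
    ("deaths", "size", "sink"),
])
```

With births of 5.0 and deaths of 2.0 per unit time, `net_rate("size", population_terms, ...)` evaluates to 3.0; a flow between two stocks yields one negative and one positive term.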

Once the above declarations are in place, the knowledge base of model fragments can be de- 
fined. The first model fragment describes the population growth phenomenon. Note that all of the 
aforementioned growth, predation and competition models contain a stock representing population 
size and two flows, one flow of births into the stock and another flow of deaths out of the stock. This 
common feature of models on population dynamics is contained in a single model fragment. 

(defModelFragment population-growth
  :source-participants
    ((?population :type population))
  :assumptions
    ((relevant growth ?population))
  :target-participants
    ((?size :type stock :name size)
     (?birth-flow :type flow :name births)
     (?death-flow :type flow :name deaths))
  :postconditions
    ((flow ?birth-flow source ?size)
     (flow ?death-flow ?size sink)
     (size-of ?size ?population)
     (births-of ?birth-flow ?population)
     (deaths-of ?death-flow ?population))
  :purpose-required
    ((endogenous ?birth-flow)
     (endogenous ?death-flow)))

Figure 8: Population growth models: (a) exponential growth, (b) logistic growth 

The variables ?birth-flow and ?death-flow become endogenous if the model contains 
an equation describing birth flow and death flow. These equations differ between population growth 
models. Two types of population growth model are the exponential growth model (Malthus, 1798), 
which is shown in Figure 8(a), and the logistic growth model (Verhulst, 1838), which is shown in 
Figure 8(b). The following two model fragments formally describe these component models: 

(defModelFragment exponential-population-growth
  :source-participants
    ((?population :type population)
     (?size :type variable)
     (?birth-flow :type variable)
     (?death-flow :type variable))
  :structural-conditions
    ((size-of ?size ?population)
     (births-of ?birth-flow ?population)
     (deaths-of ?death-flow ?population))
  :assumptions
    ((model ?size exponential))
  :target-participants
    ((?birth-rate :type variable :name birth-rate)
     (?death-rate :type variable :name death-rate))
  :postconditions
    ((== ?birth-flow (* ?birth-rate ?size))
     (== ?death-flow (* ?death-rate ?size))))

(defModelFragment logistic-population-growth
  :source-participants
    ((?population :type population)
     (?size :type variable)
     (?birth-flow :type variable)
     (?death-flow :type variable))
  :structural-conditions
    ((size-of ?size ?population)
     (births-of ?birth-flow ?population)
     (deaths-of ?death-flow ?population))
  :assumptions
    ((model ?size logistic))
  :target-participants
    ((?birth-rate :type variable :name birth-rate)
     (?death-rate :type variable :name death-rate)
     (?density :type variable :name total-population)
     (?capacity :type variable :name capacity))
  :postconditions
    ((== ?birth-flow (* ?birth-rate ?size))
     (== ?death-flow (* ?death-rate ?size ?density))
     (== ?density (C-add (/ ?size ?capacity)))
     (density-of ?density ?population)
     (capacity-of ?capacity ?population)))
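To make the difference between the two growth models concrete, the sketch below evaluates the equations introduced by the two fragments and integrates them with a simple forward Euler scheme. The function names, the parameter values and the choice of integrator are assumptions made for illustration; the paper does not prescribe a simulation method.

```python
from typing import Callable

def exponential_rate(size: float, birth_rate: float, death_rate: float) -> float:
    """d/dt size under the exponential fragment: births minus deaths."""
    births = birth_rate * size
    deaths = death_rate * size
    return births - deaths

def logistic_rate(size: float, birth_rate: float, death_rate: float,
                  capacity: float) -> float:
    """d/dt size under the logistic fragment, where the death flow is
    scaled by the density size / capacity."""
    density = size / capacity
    births = birth_rate * size
    deaths = death_rate * size * density
    return births - deaths

def simulate(rate: Callable[[float], float], size: float,
             steps: int, dt: float) -> float:
    """Forward Euler integration (an illustrative choice of integrator)."""
    for _ in range(steps):
        size += dt * rate(size)
    return size

# Illustrative parameters: equal birth and death rates, capacity of 100.
final_size = simulate(lambda n: logistic_rate(n, 0.5, 0.5, 100.0),
                      size=10.0, steps=2000, dt=0.1)
```

With equal birth and death rates, the logistic population settles where the density equals one, that is, at the capacity, whereas the exponential model has no such equilibrium other than a zero net rate.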

There is one twist in compositional modelling of population growth. Sometimes, the actual 
growth model is implicitly contained within another type of model. In such cases, the growth 
phenomenon and the corresponding differential equations are still relevant, but none of the dedicated 
growth models can be employed. For example, as will be shown later, the Lotka-Volterra predation 
model comes with its own equations describing growth. 
The model fragment other-growth allows for an empty growth model, named other, to 
be selected. However, due to the purpose-required property that any instance of ?birth-flow and 
?death-flow must be endogenous, this empty model can only be selected if a growth model is 
implicitly included elsewhere. 

(defModelFragment other-growth
  :source-participants
    ((?population :type population)
     (?size :type variable)
     (?birth-flow :type variable)
     (?death-flow :type variable))
  :structural-conditions
    ((size-of ?size ?population)
     (births-of ?birth-flow ?population)
     (deaths-of ?death-flow ?population))
  :assumptions
    ((model ?population other)))

In addition to population growth, two other phenomena are included in the knowledge base: 
predation and competition. Predation and competition relations between species are represented by 
predicates over the populations: e.g. (predation foxes rabbits) and (competition 
sheep cows). However, the existence of a phenomenon does not necessarily mean that it must be 
contained within the model. It would make little sense to model predation and competition without 
modelling the size of the populations, because models of these phenomena relate population sizes 
to one another. Therefore, the incorporation of the predation phenomenon is made dependent upon 
the existence of variables representing population size. Also, human expert modellers may prefer 
to leave a phenomenon out of the resulting model. To keep this choice open, the following two 
model fragments construct a participant representing the phenomena of predation and competition, 
and make it dependent upon a relevance assumption: 

(defModelFragment predation-phenomenon
  :source-participants
    ((?predator :type population)
     (?prey :type population)
     (?predator-size :type variable)
     (?prey-size :type variable))
  :structural-conditions
    ((predation ?predator ?prey)
     (size-of ?predator-size ?predator)
     (size-of ?prey-size ?prey))
  :assumptions
    ((relevant predation ?predator ?prey))
  :target-participant
    ((?predation-phenomenon :type phenomenon :name predation-phenomenon))
  :postconditions
    ((predation-phenomenon ?predation-phenomenon ?predator ?prey))
  :purpose-required
    ((has-model ?predation-phenomenon)))

(defModelFragment competition-phenomenon
  :source-participants
    ((?population1 :type population)
     (?population2 :type population)
     (?size1 :type variable)
     (?size2 :type variable))
  :structural-conditions
    ((competition ?population1 ?population2)
     (size-of ?size1 ?population1)
     (size-of ?size2 ?population2))
  :assumptions
    ((relevant competition ?population1 ?population2))
  :target-participant
    ((?competition-phenomenon :type phenomenon :name competition-phenomenon))
  :postconditions
    ((competition-phenomenon ?competition-phenomenon ?population1 ?population2))
  :purpose-required
    ((has-model ?competition-phenomenon)))

Figure 9: Predation models: (a) Lotka-Volterra predation, (b) Holling predation 

Both model fragments have a purpose-required property of the form (has-model ?phen) . 
This property expresses the condition that a model must exist with respect to a phenomenon: 

(defproperty has-model
  :source-participants ((?p :type phenomenon))
  :structural-conditions ((is-model-of ?p *))
  :property (has-model ?p))

The next two model fragments implement such models (thereby satisfying the above has-model 
purpose-required property) for the predation phenomenon between two populations. They describe 
two well-known predation models: the Lotka-Volterra model (1925, 1926), which is shown in 
Figure 9(a), and the Holling model (1959), which is shown graphically in Figure 9(b). 

(defModelFragment Lotka-Volterra
  :source-participants
    ((?predation-phenomenon :type phenomenon)
     (?predator :type population)
     (?predator-size :type stock)
     (?predator-birth-flow :type flow)
     (?predator-death-flow :type flow)
     (?prey :type population)
     (?prey-size :type stock)
     (?prey-birth-flow :type flow)
     (?prey-death-flow :type flow))
  :structural-conditions
    ((predation-phenomenon ?predation-phenomenon ?predator ?prey)
     (size-of ?predator-size ?predator)
     (births-of ?predator-birth-flow ?predator)
     (deaths-of ?predator-death-flow ?predator)
     (size-of ?prey-size ?prey)
     (births-of ?prey-birth-flow ?prey)
     (deaths-of ?prey-death-flow ?prey))
  :assumptions
    ((model ?predation-phenomenon lotka-volterra))
  :target-participants
    ((?prey-birth-rate :type variable :name birth-rate)
     (?predator-factor :type variable :name predator-factor)
     (?prey-factor :type variable :name prey-factor)
     (?predator-death-rate :type variable :name death-rate))
  :postconditions
    ((== ?prey-birth-flow (* ?prey-birth-rate ?prey-size))
     (== ?predator-birth-flow (* ?predator-factor ?prey-size ?predator-size))
     (== ?prey-death-flow (* ?prey-factor ?prey-size ?predator-size))
     (== ?predator-death-flow (* ?predator-death-rate ?predator-size))
     (is-model-of lotka-volterra ?predation-phenomenon)))
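The four flow equations of this fragment, combined with the stock-flow rules, give the familiar pair of Lotka-Volterra differential equations. The following Python sketch evaluates them; the function name and the numerical parameter values are illustrative assumptions, not values taken from the paper.

```python
from typing import Tuple

def lotka_volterra_rates(prey: float, predator: float,
                         prey_birth_rate: float, prey_factor: float,
                         predator_factor: float,
                         predator_death_rate: float) -> Tuple[float, float]:
    """Return (d/dt prey size, d/dt predator size) from the four flows."""
    prey_births = prey_birth_rate * prey
    prey_deaths = prey_factor * prey * predator
    predator_births = predator_factor * prey * predator
    predator_deaths = predator_death_rate * predator
    return prey_births - prey_deaths, predator_births - predator_deaths

# Illustrative parameters. The non-trivial equilibrium lies at
# prey = predator_death_rate / predator_factor and
# predator = prey_birth_rate / prey_factor.
params = dict(prey_birth_rate=1.0, prey_factor=0.5,
              predator_factor=0.25, predator_death_rate=0.5)
equilibrium_rates = lotka_volterra_rates(2.0, 2.0, **params)
```

At the equilibrium point both rates vanish, which is a convenient sanity check on the sign conventions of the flows.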

As mentioned earlier, the Lotka-Volterra model introduces its own growth model for the prey 
and predator populations by assigning specific equations to the variables that describe changes in 
the sizes of the predator and prey populations, i.e. the birth and death flows of both populations. 
Thus, it satisfies the purpose-required property in the application of the population-growth 
model fragment for the ?prey and ?predator populations. 

(defModelFragment Holling
  :source-participants
    ((?predation-phenomenon :type phenomenon)
     (?predator :type population)
     (?predator-size :type stock)
     (?capacity :type variable)
     (?prey :type population)
     (?prey-size :type stock))
  :structural-conditions
    ((predation-phenomenon ?predation-phenomenon ?predator ?prey)
     (size-of ?predator-size ?predator)
     (size-of ?prey-size ?prey)
     (capacity-of ?capacity ?predator))
  :assumptions
    ((model ?predation-phenomenon holling))
  :target-participants
    ((?search-rate :type variable :name search-rate)
     (?handling-time :type variable :name handling-time)
     (?prey-requirement :type variable :name prey-requirement)
     (?predation :type flow :name predation))
  :postconditions
    ((flow ?predation ?prey-size sink)
     (== ?predation
         (/ (* ?search-rate ?prey-size ?predator-size)
            (+ 1 (* ?search-rate ?prey-size ?handling-time))))
     (== ?capacity (C-add (* ?prey-requirement ?prey)))
     (is-model-of holling ?predation-phenomenon)))
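The predation flow defined in this fragment saturates as prey becomes abundant, which is what distinguishes it from the bilinear Lotka-Volterra term. A minimal Python sketch of the equation follows; the function name and the sample values are assumptions chosen for illustration.

```python
def holling_predation(prey_size: float, predator_size: float,
                      search_rate: float, handling_time: float) -> float:
    """Predation flow of the Holling fragment:
    (search_rate * prey * predator) / (1 + search_rate * prey * handling_time)."""
    numerator = search_rate * prey_size * predator_size
    denominator = 1 + search_rate * prey_size * handling_time
    return numerator / denominator

# With zero handling time the flow reduces to a bilinear term; with a
# positive handling time it stays below predator_size / handling_time.
bilinear = holling_predation(10.0, 2.0, 0.5, 0.0)
saturating = holling_predation(1e9, 2.0, 0.5, 0.2)
```

The upper bound `predator_size / handling_time` reflects that each predator can consume at most one prey per handling period, however abundant the prey.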

The Holling model employs a variable denoting the capacity of a population. Such a variable 
may be introduced by a logistic growth model. In practice, logistic growth models and Holling 
predation models are often used in conjunction. The compositional modeller need not be aware of 
such combinations of models, however. All it needs to know is the prerequisites of the individual 
component models contained within each model fragment. 
Figure 10: A species competition model 



The final model fragment in the knowledge base implements a model of competition between 
two species. It formally describes the competition model type depicted in Figure 10. As this model 
fragment contains the only population competition model in the knowledge base, it does not contain 
a model assumption to represent the model. 



(defModelFragment competition
  :source-participants
    ((?competition-phenomenon :type phenomenon)
     (?population-1 :type population)
     (?size-1 :type stock)
     (?density-1 :type variable)
     (?capacity-1 :type variable)
     (?population-2 :type population)
     (?size-2 :type stock)
     (?density-2 :type variable)
     (?capacity-2 :type variable))
  :structural-conditions
    ((competition-phenomenon ?competition-phenomenon ?population-1 ?population-2)
     (density-of ?density-1 ?size-1)
     (capacity-of ?capacity-1 ?size-1)
     (density-of ?density-2 ?size-2)
     (capacity-of ?capacity-2 ?size-2))
  :assumptions
    ((relevant competition ?population-1 ?population-2))
  :target-participants
    ((?weight-12 :type variable :name weight)
     (?weight-21 :type variable :name weight))
  :postconditions
    ((== ?density-1 (C-add (/ (* ?weight-12 ?size-2) ?capacity-1)))
     (== ?density-2 (C-add (/ (* ?weight-21 ?size-1) ?capacity-2)))))
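The competition fragment adds a further composable term to the density variable that the logistic growth fragment already defines. Assuming that C-add terms on the same variable are combined by summation, the composed density can be sketched in Python as below; the function name `composed_density` and the sample numbers are hypothetical.

```python
from typing import List, Tuple

def composed_density(own_size: float, capacity: float,
                     competitors: List[Tuple[float, float]]) -> float:
    """Sum the C-add terms contributed to one density variable.

    The logistic fragment contributes own_size / capacity; each competition
    fragment contributes weight * competitor_size / capacity.
    `competitors` is a list of (weight, competitor_size) pairs.
    """
    terms = [own_size / capacity]
    terms += [weight * size / capacity for weight, size in competitors]
    return sum(terms)

# A population of 50 with capacity 100, competing with a population of 40
# whose individuals count half as much (weight 0.5).
density_1 = composed_density(50.0, 100.0, [(0.5, 40.0)])
```

Without competitors the function returns the plain logistic density, so adding or removing the competition phenomenon changes the model only by one additive term.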
Figure 11: Model space for the 1 predator and 2 competing prey scenario 



4.2 Model space 

A model space is constructed when the knowledge base is instantiated with respect to a given sce- 
nario. Consider for example the following scenario, which describes a predator population that 
preys on two other populations, prey1 and prey2, whilst the two prey populations compete with 
one another: 

(defScenario pred-prey-prey-scenario
  :entities ((predator :type population)
             (prey1 :type population)
             (prey2 :type population))
  :relations ((predation predator prey1)
              (predation predator prey2)
              (competition prey1 prey2)))

The full specification of the model space is too unwieldy to present here but an abstract graphical 
representation of the model space for this scenario is shown in Figure 11. This model space contains 
the following knowledge: 

• From each of the three populations in the scenario, a set of three population growth models 
(i.e. exponential, logistic and other) is derived. This inference is dependent upon 
a relevance assumption of the population growth phenomenon, and a model assumption that 
corresponds to one of the three population growth models. 
• From both predation relations (i.e. (predation predator prey1) and (predation 
predator prey2)), and the populations related by them, a set of two predation models 
(i.e. Lotka-Volterra and Holling) is derived. This inference is dependent upon a 
relevance assumption of the predation phenomenon and a model assumption that corresponds to 
one of the two predation models. 

• From the competition relation (competition prey1 prey2), and the populations 
related by it, a competition model is derived. Because there is only one competition model, 
the inference of the competition model is only dependent upon a relevance assumption that 
corresponds to the competition phenomenon. 

In addition to the hypergraph of Figure 11, the model space also contains a number of constraints 
on the conjunctions of assumptions that are consistent. As explained earlier, these stem from two 
sources: 1) non-composable relations and 2) purpose-required properties. An example will be given 
of each type. 

Let predation-phen-1 be the predation phenomenon between predator and prey1, 
and prey1-size be the variable representing the size of the prey1 population. In this 
example, the model fragments exponential-population-growth and Lotka-Volterra 
will each generate an equation for computing the value of a variable representing the change in 
prey1-size. Because the two equations cannot be composed, the following inconsistency is 
generated: 

(relevant growth prey1) $\wedge$ (model prey1-size exponential) $\wedge$ 
(relevant growth predator) $\wedge$ (relevant predation predator prey1) $\wedge$ 
(model predation-phen-1 lotka-volterra) $\rightarrow \bot$ 

Inconsistencies also arise from purpose-required properties. For example, if the model 
fragment predation-phenomenon is applicable and the predation relation is deemed relevant, then 
the purpose-required property (has-model ?pred-phen) will become a condition for 
consistency. Under certain combinations of assumptions, this property may not be satisfied. For instance, 
when the Holling predation and exponential growth models are both selected, the Holling model is not 
generated because there is no ?capacity for which (capacity-of ?capacity ?predator) is true. No 
predation model is created in this case (because the Holling model fragment cannot be 
instantiated), even though the predation phenomenon is deemed relevant under this set of assumptions. This 
is inconsistent with the has-model purpose-required property in the predation-phenomenon 
model fragment, and the responsible combination of assumptions is therefore marked as nogood: 

(relevant growth predator) $\wedge$ (model predator-size exponential) $\wedge$ 
(relevant growth prey1) $\wedge$ (model prey1-size exponential) $\wedge$ 
(relevant predation predator prey1) $\wedge$ (model predation-phen-1 holling) $\rightarrow \bot$ 
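Checking a candidate set of assumptions against such nogoods amounts to a subset test. The Python sketch below encodes only the two inconsistencies given above; the string encoding of assumptions and the function name `violated_nogoods` are assumptions made for illustration, and the real model space contains further nogoods that are not listed in the text.

```python
from typing import FrozenSet, Iterable, List

# The two example nogoods from the text, each a conjunction of assumptions.
NOGOODS: List[FrozenSet[str]] = [
    frozenset({
        "(relevant growth prey1)", "(model prey1-size exponential)",
        "(relevant growth predator)", "(relevant predation predator prey1)",
        "(model predation-phen-1 lotka-volterra)",
    }),
    frozenset({
        "(relevant growth predator)", "(model predator-size exponential)",
        "(relevant growth prey1)", "(model prey1-size exponential)",
        "(relevant predation predator prey1)",
        "(model predation-phen-1 holling)",
    }),
]

def violated_nogoods(assumptions: Iterable[str]) -> List[FrozenSet[str]]:
    """Return every nogood that is entirely contained in the assumption set."""
    chosen = set(assumptions)
    return [nogood for nogood in NOGOODS if nogood <= chosen]

# A candidate that pairs exponential prey growth with Lotka-Volterra predation.
candidate = [
    "(relevant growth prey1)", "(model prey1-size exponential)",
    "(relevant growth predator)", "(model predator-size logistic)",
    "(relevant predation predator prey1)",
    "(model predation-phen-1 lotka-volterra)",
]
```

Here `violated_nogoods(candidate)` returns the first nogood only, since the second one requires an exponential growth model for the predator as well.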



4.3 aDPCSP and solution 

The resultant model space is translated into an aDCSP to enable the selection of a consistent set of 
assumptions, using advanced CSP solution techniques. The aDCSP derived from the above model 
space is depicted in Figure 12. 
Attribute    Meaning
$x_1$        (relevant growth prey1)
$x_2$        (relevant growth prey2)
$x_3$        (relevant growth predator)
$x_4$        (relevant predation predator prey1)
$x_5$        (relevant predation predator prey2)
$x_6$        (relevant competition prey1 prey2)
$x_7$        (model size-1 *)
$x_8$        (model size-2 *)
$x_9$        (model size-3 *)
$x_{10}$     (model predation-phen-1 *)
$x_{11}$     (model predation-phen-2 *)

Table 4: Attribute list 



Domain       Content                                Meaning
$D_1$        $\{d_{1,y}, d_{1,n}\}$                 {population, none}
$D_2$        $\{d_{2,y}, d_{2,n}\}$                 {population, none}
$D_3$        $\{d_{3,y}, d_{3,n}\}$                 {population, none}
$D_4$        $\{d_{4,y}, d_{4,n}\}$                 {(population, population), none}
$D_5$        $\{d_{5,y}, d_{5,n}\}$                 {(population, population), none}
$D_6$        $\{d_{6,y}, d_{6,n}\}$                 {(population, population), none}
$D_7$        $\{d_{7,l}, d_{7,e}, d_{7,o}\}$        {logistic, exponential, other}
$D_8$        $\{d_{8,l}, d_{8,e}, d_{8,o}\}$        {logistic, exponential, other}
$D_9$        $\{d_{9,l}, d_{9,e}, d_{9,o}\}$        {logistic, exponential, other}
$D_{10}$     $\{d_{10,h}, d_{10,lv}\}$              {Holling, Lotka-Volterra}
$D_{11}$     $\{d_{11,h}, d_{11,lv}\}$              {Holling, Lotka-Volterra}

Table 5: The aDCSP for the 1 predator and 2 competing prey scenario: domains and their contents 
and meaning 



This aDCSP contains 11 attributes. They are listed with the corresponding assumption classes 
in Table 4. The first 6 attributes correspond to the relevance of phenomena: 3 population 
growth phenomena, 2 predation phenomena and 1 competition phenomenon, to be precise. The other 
5 attributes correspond to 5 sets of model types: 3 sets of population growth models and 2 sets of 
predation models. 

The assumptions from which the attributes were generated form domains of values. The 
resulting domains of the aforementioned attributes are summarised in Table 5. 

The activity constraints in the aDCSP describe the conditions that instantiate the subject of the 
assumptions that correspond to an attribute. Since each participant or relation has a label in the 
model space, a minimal set of assumptions under which it becomes part of the emerging model 
is available. When a participant or relation is the subject of an assumption, this label explicitly 
describes the sets of assumptions under which the attribute that corresponds to that subject should 
be activated. By translating the label of a subject into sets of attribute-value assignments, the 
antecedents of the activity constraints are constructed. 

Figure 12: aDCSP derived from the model space reflecting the 1 predator and 2 competing prey 
scenario (legend: value, compatibility constraint, activity constraint) 

In this example, the relevance assumptions (attributes $x_1, \ldots, x_6$) take their subjects from the 
scenario, and hence, they are always active. The attributes related to the model assumptions for 
population growth are active if the corresponding assumptions denoting relevance of population 
growth are true. That is, 

$x_1 : d_{1,y} \rightarrow \mathrm{active}(x_7)$ 
$x_2 : d_{2,y} \rightarrow \mathrm{active}(x_8)$ 
$x_3 : d_{3,y} \rightarrow \mathrm{active}(x_9)$ 

The attributes related to the assumptions about the predation models are active if the corresponding 
assumptions denoting relevance of predation, and the assumptions describing relevance of 
population growth, are true for the populations involved in the predation relation. That is, 

$x_1 : d_{1,y} \wedge x_3 : d_{3,y} \wedge x_4 : d_{4,y} \rightarrow \mathrm{active}(x_{10})$ 
$x_2 : d_{2,y} \wedge x_3 : d_{3,y} \wedge x_5 : d_{5,y} \rightarrow \mathrm{active}(x_{11})$ 

Figure 12 shows a graphical representation of these activity constraints. 
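The five activity constraints above can be evaluated mechanically. The following Python sketch is illustrative only: the dictionary encoding, the shorthand values `"y"` and `"n"` for $d_{i,y}$ and $d_{i,n}$, and the function name `active_attributes` are assumptions, not part of the aDCSP formalism.

```python
from typing import Dict, List, Set

# Attributes whose subjects come from the scenario are always active.
ALWAYS_ACTIVE: List[str] = ["x1", "x2", "x3", "x4", "x5", "x6"]

# Antecedent of each activity constraint: the attribute-value assignments
# that must all hold for the attribute on the left to become active.
ACTIVITY: Dict[str, Dict[str, str]] = {
    "x7": {"x1": "y"},
    "x8": {"x2": "y"},
    "x9": {"x3": "y"},
    "x10": {"x1": "y", "x3": "y", "x4": "y"},
    "x11": {"x2": "y", "x3": "y", "x5": "y"},
}

def active_attributes(assignment: Dict[str, str]) -> Set[str]:
    """Return the attributes that are active under the given assignment."""
    active = set(ALWAYS_ACTIVE)
    for attribute, antecedent in ACTIVITY.items():
        if all(assignment.get(a) == v for a, v in antecedent.items()):
            active.add(attribute)
    return active

all_relevant = {a: "y" for a in ALWAYS_ACTIVE}
no_predator_growth = dict(all_relevant, x3="n")
```

When the growth of the predator is deemed irrelevant ($x_3 : d_{3,n}$), the attributes $x_9$, $x_{10}$ and $x_{11}$ all remain inactive, because each of their antecedents mentions $x_3 : d_{3,y}$.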

The compatibility constraints correspond directly to the inconsistencies in the nogood node. 
These inconsistencies have been discussed in the previous section and are depicted in Figure 12. 

Once the aDCSP is constructed, preferences may be attached to attribute-value assignments. 
Suppose that preferences are only assigned to the standard population modelling choices, i.e. 
exponential growth, logistic growth, Lotka-Volterra predation and Holling predation, and to the relevance 
of competition (because only one type of model has been implemented for this phenomenon). For 
example, the following BPQs could be employed: 

$p_{\text{exponential}} \prec p_{\text{logistic}}$ 

$p_{\text{lotka-volterra}} \prec p_{\text{holling}}$ 

$p_{\text{competition}}$ 

The logistic and Holling models are preferred over the exponential and Lotka-Volterra models 
because the former are generally regarded as being more accurate. Note that the preferences have 
been ordered in such a way that those corresponding to different phenomena are not related to one 
another. The justification for this ordering is that, even though the models are structurally connected 
(there are restrictions over which models can be combined with one another), models of different 
phenomena inherently describe behaviours that cannot be compared with one another. The preference 
assignments for attribute-value assignments are summarised in Table 6. 

Attribute              Preference assignments
$x_1, \ldots, x_5$     no preference assignments
$x_6$                  $P(x_6 : d_{6,y}) = p_{\text{competition}}$
$x_7$                  $P(x_7 : d_{7,l}) = p_{\text{logistic}}$, $P(x_7 : d_{7,e}) = p_{\text{exponential}}$
$x_8$                  $P(x_8 : d_{8,l}) = p_{\text{logistic}}$, $P(x_8 : d_{8,e}) = p_{\text{exponential}}$
$x_9$                  $P(x_9 : d_{9,l}) = p_{\text{logistic}}$, $P(x_9 : d_{9,e}) = p_{\text{exponential}}$
$x_{10}$               $P(x_{10} : d_{10,h}) = p_{\text{holling}}$, $P(x_{10} : d_{10,lv}) = p_{\text{lotka-volterra}}$
$x_{11}$               $P(x_{11} : d_{11,h}) = p_{\text{holling}}$, $P(x_{11} : d_{11,lv}) = p_{\text{lotka-volterra}}$

Table 6: Preference assignments for the 1 predator and 2 competing prey problem 

Solving this aDPCSP is simple. First, the attributes $x_1, \ldots, x_6$ are activated. Each of these 
attributes is assigned $x_i : d_{i,y}$ because that assignment maximises the potential preference. Then, 
the attributes $x_7, \ldots, x_{11}$ are activated. Here, attributes $x_7, \ldots, x_9$ are assigned $x_i : d_{i,l}$ because the 
logistic growth model has the highest preference. Finally, $x_{10}$ and $x_{11}$ are assigned $x_{10} : d_{10,h}$ and 
$x_{11} : d_{11,h}$ because the Holling models have the highest preference and are not inconsistent with the 
logistic model committed earlier. The resulting solution satisfies the following set of assumptions: 

{(relevant growth prey1), 
 (relevant growth prey2), 
 (relevant growth predator), 
 (relevant competition prey1 prey2), 
 (relevant predation predator prey1), 
 (relevant predation predator prey2), 
 (model size-1 logistic), 
 (model size-2 logistic), 
 (model size-3 logistic), 
 (model predation-phen-1 holling), 
 (model predation-phen-2 holling)} 
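The solution above can be reproduced with a small exhaustive search. The Python sketch below is a deliberately simplified stand-in for an aDPCSP solver: it replaces the order-of-magnitude preference quantities with plain integer scores that are summed (an assumption made purely for illustration), it includes only the two nogoods spelled out in section 4.2, and all names in it are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

RELEVANCE = ["x1", "x2", "x3", "x4", "x5", "x6"]           # domains {y, n}
MODEL_DOMAINS = {"x7": ["l", "e", "o"], "x8": ["l", "e", "o"],
                 "x9": ["l", "e", "o"], "x10": ["h", "lv"], "x11": ["h", "lv"]}
# Activity constraints: a model attribute is active only if all listed
# relevance attributes take the value y.
ACTIVITY = {"x7": ["x1"], "x8": ["x2"], "x9": ["x3"],
            "x10": ["x1", "x3", "x4"], "x11": ["x2", "x3", "x5"]}
# Compatibility constraints: only the two nogoods given in the text.
NOGOODS: List[Dict[str, str]] = [
    {"x1": "y", "x7": "e", "x3": "y", "x4": "y", "x10": "lv"},
    {"x3": "y", "x9": "e", "x1": "y", "x7": "e", "x4": "y", "x10": "h"},
]
# Integer scores standing in for the BPQs (assumption): logistic and Holling
# outrank exponential and Lotka-Volterra; competition relevance scores 1.
SCORE = {("x6", "y"): 1}
for growth in ("x7", "x8", "x9"):
    SCORE[(growth, "l")], SCORE[(growth, "e")] = 2, 1
for predation in ("x10", "x11"):
    SCORE[(predation, "h")], SCORE[(predation, "lv")] = 2, 1

def consistent(assignment: Dict[str, str]) -> bool:
    """An assignment is consistent if it contains no nogood."""
    return not any(all(assignment.get(k) == v for k, v in nogood.items())
                   for nogood in NOGOODS)

def solve() -> Tuple[Optional[Dict[str, str]], int]:
    """Enumerate all assignments to active attributes and keep the best."""
    best, best_score = None, -1
    for values in product("yn", repeat=len(RELEVANCE)):
        relevance = dict(zip(RELEVANCE, values))
        active = [a for a, needs in ACTIVITY.items()
                  if all(relevance[n] == "y" for n in needs)]
        for choice in product(*(MODEL_DOMAINS[a] for a in active)):
            assignment = {**relevance, **dict(zip(active, choice))}
            if not consistent(assignment):
                continue
            score = sum(SCORE.get(item, 0) for item in assignment.items())
            if score > best_score:
                best, best_score = assignment, score
    return best, best_score

solution, solution_score = solve()
```

Under these simplifying assumptions the search returns the same assignment as the one derived above: all six relevance attributes set to y, logistic growth for the three populations and Holling predation for both predation phenomena. A practical solver would prune the search rather than enumerate every assignment.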
Figure 13: Deducing a scenario model from the model space, given a set of assumptions 



4.4 Sample scenario model 

Figure 13 shows how a scenario model can be deduced from the above set of assumptions by ex- 
ploiting the model space. The nodes corresponding to the aforementioned assumptions and those 
that logically follow from the assumption set are indicated in the Figure. 

When combining the participants and relations in the resulting scenario model, the model given 
in Figure 14 can be drawn. This model corresponds to the one that an ecologist would draw if the 
logistic growth and Holling predation models were regarded to be appropriate for the task at hand. 

5. Conclusion and Future Work 

This article has presented a novel approach to compositional modelling that enables the construction 
of models of ecological systems. This work differs from existing approaches in that it automatically 
translates the compositional modelling problem into an aDCSP with (order-of-magnitude) prefer- 
ence valuations. There are several benefits to this method. 

The use of a translation algorithm that converts the compositional modelling problem into an 
aDCSP allows criteria to be formalised. More importantly, it also enables efficient, existing and 
future, aDCSP solution techniques to be effectively applied to solving compositional modelling 
problems. 
Figure 14: Sample scenario model for the 1 predator and 2 competing prey scenario 

The extension of the aDCSPs with (order-of-magnitude) preferences (to form aDPCSPs) also 
permits the incorporation of softer requirements in the compositional modelling problem. In this 
paper, order-of-magnitude preferences have been employed to express the appropriateness of 
alternative model types for certain phenomena. While such considerations may be described by hard 
constraints in the physical systems domain, they are more subjective in less understood problem 
domains, such as the ecological modelling domain. The approach presented herein provides a means 
to capture and represent the subtlety of the flexible model design decisions. 

The theoretical ideas presented in this article have been applied to real-world ecological mod- 
elling problems. In this paper, it has been demonstrated how the resultant compositional modeller 
can be employed to create a repository of population dynamics models. The approach has also been 
applied to automated model construction of large and complex ecosystems such as the MODMED 
model of Mediterranean vegetation (Legg et al., 1995), as reported by Keppens (2002). 

There are some practical and theoretical issues that need to be addressed, however. On the prac- 
tical side, the types of ecological model design decisions, as represented by the assumptions and 
assumption classes, and as supported by the inference mechanisms, should be extended. Ecological 
systems tend to involve interrelated populations of individuals, instead of functional compositions of 
individual components as with physical systems. One particularly important type of design decision 
in ecological modelling is therefore granularity. This requires the introduction of novel representa- 
tion formalisms and inference mechanisms such as aggregation and disaggregation. Initial work for 
considering populations as single entities and for dividing such entities into sub-populations when 
necessary has been carried out (Keppens & Shen, 2001a). Integration of such work into the present 
aDPCSP framework requires further investigation. 

On the theoretical side, the analysis of the complexity of the present approach is rather informal. 
Much remains to be done in this regard, especially when comparing to the complexity of existing 
compositional modellers. For this comparison, additional work will be required to adapt the cur- 
rent translation procedure to suit existing compositional modelling problems. Most compositional 
modellers are of exponential complexity, however. As they employ problem-specific solution algo- 
rithms, little is known about opportunities for improving their efficiency. This work hopes to be a 
first step toward further understanding this important issue. 